Digital Wireless Basestation

Field of the Invention

This invention relates to a digital wireless basestation. A basestation is a transceiver node
in a radio communications system, such as UMTS (Universal Mobile Telecommunications System).
Conventionally, one basestation communicates with multiple user equipment (UE)
terminals. The terms 'communicates' and 'communication' cover one-way
communication (e.g. a radio broadcast) and two-way communication (e.g. UMTS), and can
be one to one or one to many.



Description of the Prior Art 

Digital signal processing in a digital wireless communications basestation is characterised
by wide (i.e. highly parallel) algorithms with low latencies, high numerical instruction
loadings and massive DMA channels. This is a demanding environment, traditionally
satisfied by application specific hardware, often using ASICs (application specific
integrated circuits). These kinds of hardware based digital wireless communications
basestations can take over a year to produce, and have a large development expense
associated with them. Whilst software architectures have also been used in digital wireless
communications basestations, they have tended to be very monolithic and intractable,
being based around non object-oriented languages such as C, limited virtual machines
(the RTOS layer), and non-intuitive hardware description systems such as VHDL.

The practical result of this is that basestation vendors have been able to force network
operators into purchasing hardware, software and RF components together, all too often
in a sub-optimal configuration. Closed (or effectively closed) interfaces into the
basestations have led to the necessity to use that vendor's base station controllers also,
further reducing choice and driving down quality. And significant changes in the
underlying communications standards have all too often required a 'forklift upgrade',
with hardware having to be modified on site.

Digital radio standards (such as UMTS) are, however, so complex and change so quickly
that it is becoming increasingly difficult to apply these conventional hardware based
design solutions. The inflexibility of current digital wireless communication basestation
designs can be seen in the starkest contrast if one moves to the non-analogous arena of
the PC. The PC offers an appropriate set of hardware resources (screen, memory,
processor, keyboard etc), wrapped up in a hardware abstraction layer (the Windows™
virtual machine), sufficient to meet the demands of a wide range of applications, which
may then be developed entirely using high-level software. There are many benefits to
solving application needs in software: it is fast to produce, relatively cheap to develop
(allowing a wide number of players to enter the market, generating competition), and the
end product has an almost zero distribution and storage cost.

The PC is also a generic and extensible hardware design, allowing multiple hardware
vendors to build variants and peripherals in competition, driving availability and quality
up and end-user costs down.

Applying the same paradigm to the non-analogous digital signal processing (DSP) world,
particularly basestation design, has not occurred to date because the DSP/basestation
world has an entirely different set of algorithm requirements from the business/home
application space.

In a first aspect of the invention, there is a digital wireless communications basestation
programmed with a virtual machine layer. The virtual machine layer is suitable for
enabling one or more baseband processing data flows to be represented using high level
software, calling through for high-MIPs functions to underlying 'engines'.

In one implementation of the present invention, commodity protocols and hardware are 
utilised to turn the basestation, (conventionally a highly expensive, vendor-locked, 
application specific product), into a generic, scalable baseband platform, capable of 
executing many different modulation standards with simply a change of software. IP is 



BNSDOClD <WO __ 0154300A2J .> 



WO 01/54300 



PCT/GB01/00280



3 

used to connect this device to the backnet, and IP is also used to feed digitised IF to and
from third party RF modules, using an open data and control format. This approach,
focussing on moving the basestation into the software arena using commodity hardware,
decomposition and open standards, promises to provide great benefits, whilst at the
same time significantly reducing the inherent technology risk involved in taking up new
communications protocols. These general principles can be enlarged upon as follows. In
an implementation, the hardware abstraction layer runs on hardware comprising a PCI-bus
backplane. The use of the industry standard 32 bit x 33 MHz PCI backplane makes
available: (i) a wide range of sophisticated and low cost devices (such as bus-mastering
DMA bridge chips), previously restricted to the PC domain; (ii) the PC as a development
platform (with its wide range of development tools and peripheral support); and (iii) the
PC as a remote monitoring platform. The hardware elements within the
virtual machine may communicate using an appropriate, architecture neutral messaging
system. For example, I2O compliant messaging may be used: the use of an industry wide
messaging standard exemplifies the general approach of the present invention, away from
closed, proprietary systems and towards open systems which many different suppliers can
develop for.

A further example of this approach is for the RF elements to connect to the basestation
through an interface which is an open interface. Previously, closed, proprietary interfaces
have been the norm; these make it difficult for RF suppliers with highly specialised
analogue design skills to develop products, since to do so requires a knowledge of
complex and fast changing digital basestation design. But by making the interface an
open one, RF suppliers can finally compete effectively since they can develop products
without a detailed knowledge of the underlying and complex requirements of the
basestation, instead designing RF elements which satisfy a straightforward interface
specification. The open interface may define one or more of the following components:

(i) power feed;

(ii) data;

(iii) controls;

(iv) timing/synchronisation;

(v) status.

An implementation also uses standard IP based protocols: the basestation sends an IP-based
digital IF feed to a radio mast. The IP feed is fed up to multiple RF units and the
IP feed derived from a signal received at the mast can be passed down to multiple
processor boards. Using standard IP based protocols makes available a broad range of
IP based components and expertise, lowering costs and facilitating third party design
contributions. In one preferred implementation, bus LVDS (low voltage differential
signalling) is used as the underlying bearer for the data component sent to and from the
RF 'heads', supporting the [...]. In another
implementation, a fibre optic bearer (such as Fibre Channel) is used as the bearer. Use of
fibre optic bearers becomes more attractive as the distance between the basestation
proper and the RF heads increases, and as the IF bandwidth increases (either as a result
of a higher IF nominal centre frequency, a greater number of bits used in
the ADC/DACs, or a combination of both of these factors).

The basestation typically comprises a scheduler, programmed to allow scalable processing
using multiple parallel processing [...] of
resources to enable it to dynamically [...] activity at runtime. The
scheduler may read an 'a priori' partitioning file to [...] about which
datapaths ought to execute on which processing units.

The basestation may change from operating one set of baseband processing algorithms
to another set solely by changes to the underlying 'engines', implemented in either soft
datapaths or hard datapaths (or a combination of the two), where a hard datapath is a
flow implemented in an ASIC or FPGA, and a soft datapath is a flow implemented over a
conventional programmable DSP. Further, multiple standards can be run simultaneously
on a single basestation.

One foundation feature of the present invention is the concept of the virtual machine, or
hardware abstraction layer, as applied to a digital wireless basestation. Appendix 1
describes in more detail the meaning, purpose and detail of a hardware abstraction layer
and its general application to two-way broadcast stacks, as are found in a digital wireless
basestation such as a UMTS node-b. For the purposes of this summary, the hardware
abstraction layer allows the separation of high complexity, but low-MIPs,
standard-specific code (which may be written in an architecture neutral manner) from the
underlying high-MIPs engines, the implementations of which are tied to particular
architectures, but which have application across a number of different communications
systems.



More generally, the hardware abstraction layer is software programmed with various core
processes and/or core structures and/or core functions and/or flow control and/or state
management: one of the core processes includes algorithms to perform one or more of
the following: source coding, channel coding, modulation; or their inverses, namely
source decoding, channel decoding and demodulation.



An implementation of the virtual machine layer is called the CVM
(Communications Virtual Machine). The CVM is both a platform for developing digital
signal processing products and also a runtime for actually running those products. The
CVM in essence brings the complexity management techniques associated with a virtual
machine layer to real-time digital signal processing by (i) placing high MIPS digital signal
processing computations (which may be implemented in an architecture specific manner)
into 'engines' on one side of the virtual machine layer and (ii) placing architecture neutral,
low MIPS code (e.g. the Layer 1 code defining various low MIPS processes) on the other
side. More specifically, the CVM separates all high complexity, but low-MIPs control
plane and data 'operations and parameters' flow functionality from the high-MIPs
'engines' performing resource-intensive operations (e.g., Viterbi decoding, FFT, correlations, etc.).
This separation enables complex communications baseband stacks to be built in an
'architecture neutral', highly portable manner since baseband stacks can be designed to
run on the CVM, rather than the underlying hardware. The CVM presents a uniform set
of APIs to the high complexity, low MIPS control codes of these stacks, allowing high
MIPS engines to be re-used for many different kinds of stacks (e.g. a Viterbi decoding
engine can be used for both a GSM and a UMTS stack).
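
By way of illustration only, the separation described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the CVM itself: the names EngineRegistry, register, call, correlate, gsm_control_flow and umts_control_flow are all hypothetical, and a toy correlation routine stands in for a real high-MIPs engine such as a Viterbi decoder. It shows the control code of two stacks reaching the same engine through one uniform call interface.

```python
from typing import Callable, Dict, List


class EngineRegistry:
    """Hypothetical stand-in for the uniform API presented to control code."""

    def __init__(self) -> None:
        self._engines: Dict[str, Callable[..., object]] = {}

    def register(self, name: str, engine: Callable[..., object]) -> None:
        # The engine body may be architecture specific; callers never see it.
        self._engines[name] = engine

    def call(self, name: str, *args: object) -> object:
        # Control code names the operation; the registry dispatches it.
        return self._engines[name](*args)


def correlate(signal: List[int], pattern: List[int]) -> List[int]:
    """Toy 'engine': sliding dot product of pattern against signal."""
    span = len(signal) - len(pattern) + 1
    return [
        sum(signal[i + j] * pattern[j] for j in range(len(pattern)))
        for i in range(span)
    ]


registry = EngineRegistry()
registry.register("correlate", correlate)


def gsm_control_flow(samples: List[int]) -> object:
    # Low-MIPs, standard-specific code for one (hypothetical) stack.
    return registry.call("correlate", samples, [1, 1])


def umts_control_flow(samples: List[int]) -> object:
    # A second stack re-uses the same engine with different parameters.
    return registry.call("correlate", samples, [1, -1])
```

Replacing the body of correlate with an architecture-native implementation would leave both control flows unchanged, which is the portability property the text attributes to the virtual machine layer.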

The virtual machine layer supports underlying high MIPs algorithms common to a
number of different baseband processing algorithms, and makes these accessible to high
level, architecture neutral, potentially high complexity control flows
through a scheduler interface, which allows the control flow to specify the algorithm to
be executed, together with a set of resource constraint envelopes, relating to one or more
of: time of execution, memory, interconnect bandwidth, inside of one or more of which
the caller desires the execution to take place.
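
A request of this kind might be represented as in the following sketch. The class and field names (Envelope, EngineEstimate, select_engine and so on) are assumptions made for illustration, as is the policy of choosing the fastest engine that fits; the specification states only that the caller names the algorithm and supplies envelopes for time, memory and interconnect bandwidth.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Envelope:
    """Upper bounds the caller will accept (None means unconstrained)."""
    max_time_us: Optional[float] = None
    max_memory_bytes: Optional[int] = None
    max_bus_bytes: Optional[int] = None


@dataclass(frozen=True)
class EngineEstimate:
    """Hypothetical predicted cost of running the algorithm on one engine."""
    engine: str
    time_us: float
    memory_bytes: int
    bus_bytes: int


def inside(estimate: EngineEstimate, envelope: Envelope) -> bool:
    # An estimate is inside the envelope if every stated bound is respected.
    checks = [
        (estimate.time_us, envelope.max_time_us),
        (estimate.memory_bytes, envelope.max_memory_bytes),
        (estimate.bus_bytes, envelope.max_bus_bytes),
    ]
    return all(limit is None or cost <= limit for cost, limit in checks)


def select_engine(estimates: List[EngineEstimate],
                  envelope: Envelope) -> Optional[str]:
    """Return the fastest engine whose estimate fits, or None if none fits."""
    fitting = [e for e in estimates if inside(e, envelope)]
    if not fitting:
        return None
    return min(fitting, key=lambda e: e.time_us).engine
```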

During the development stage of a digital signal processing product, the MIPS
requirements of various designs of the digital signal processing product can be simulated
or modelled by the CVM in order to identify the arrangement which gives the optimal
access cost (e.g. will perform with the minimum number of processors); a resource
allocation process is used for modelling which uses at least one stochastic, statistical
distribution function (and/or a statistical measurement function), as opposed to a
deterministic function. Simulations of various DSP chip and FPGA implementations are
possible; placing high MIPS operations into FPGAs is highly desirable because of their
speed and parallel processing capabilities.
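
The idea of sizing a design from a stochastic rather than a deterministic load model can be illustrated with a small Monte Carlo sketch. Everything here is an assumption chosen for illustration: the Gaussian per-task load, the function names overload_probability and min_processors, and the 1% overload target. It is not the modelling method of the CVM, only one simple instance of the technique.

```python
import random


def overload_probability(n_processors: int, capacity_mips: float,
                         task_mean: float, task_sd: float, n_tasks: int,
                         trials: int = 2000, seed: int = 0) -> float:
    """Estimate how often the total demand exceeds the total capacity."""
    rng = random.Random(seed)
    overloads = 0
    for _ in range(trials):
        # Draw each task's MIPS demand from a (clipped) normal distribution.
        demand = sum(max(0.0, rng.gauss(task_mean, task_sd))
                     for _ in range(n_tasks))
        if demand > n_processors * capacity_mips:
            overloads += 1
    return overloads / trials


def min_processors(capacity_mips: float, task_mean: float, task_sd: float,
                   n_tasks: int, max_overload: float = 0.01,
                   limit: int = 64) -> int:
    """Smallest processor count whose overload probability is acceptable."""
    for n in range(1, limit + 1):
        p = overload_probability(n, capacity_mips, task_mean, task_sd, n_tasks)
        if p <= max_overload:
            return n
    raise ValueError("no configuration within the limit meets the target")
```

With a standard deviation of zero the model collapses to the deterministic case, which gives a convenient sanity check on the simulation.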

During actual operation, the CVM can intelligently allocate tasks in real-time
to computational resources in order to maintain optimal operation. This approach
is referred to as '2 Phase Scheduling' in this specification. Because the resource
requirements of different engines can be (i) explicitly modelled at design time and (ii)
intelligently utilised during runtime, it is possible to mix engines from several different
vendors in a single product. As noted above, these engines connect up to the Layer 1
control codes not directly, but instead through the intermediary of the CVM virtual
machine layer. Further, efficient migration from the PC non-real time prototype to a
run time using a DSP and FPGA combination and then onto a custom ASIC is possible.

The CVM is implemented with three key features: 

• Dynamic, multi-memory-space multiprocessor distributed scheduler with support 
for co-scheduling. 

• APIs to commonly used, high-MIPs operations for digital broadcast and
communications, with architecture-native implementations.

• Resource management and normalisation layer (provided over the native RTOS).

In a second aspect of the present invention, there is a baseband stack forming the 
baseband stack of a basestation as defined in the first aspect.

In a third aspect of the present invention, there is a design tool for simulating the
baseband stack of the second aspect, in which the design tool can link together software
and hardware components using a number of standard connection types and
synchronisation methods which enable the management of a pipeline to be determined
by the data processed by the pipeline. The design tool can support stochastic simulation
of load on multiple parallel datapaths (distribution to underlying 'engines' of the virtual
machine) where the effect of the distribution of these datapaths to different positions
within a non-symmetric memory topology (e.g., some components being local, others
accessible across a contested bus, etc) may be explored with respect to expected loading
patterns for given precomputed scenarios of use. The output of such a design tool is an
initial partitioning of the design 'engines' (high-MIPs components) into variously
distributed 'hard' and 'soft' datapaths (where a hard datapath is a flow implemented in an
ASIC or FPGA, and a soft datapath is a flow implemented over a conventional
programmable DSP). This partitioning is visible to the dynamic scheduling engine (by
means of which the high level, architecture neutral software dispatches its processing
requests to the underlying engines) and is used by it to assist in the process of making
optimal or close to optimal runtime scheduling decisions.

In a fourth aspect, there is a method of designing part or all of a basestation device in
which the step of using software programmed with a virtual machine layer appropriate to
baseband signal processing occurs.

In a fifth aspect, there is computer software suitable for a digital wireless basestation, the 
software operating as a hardware abstraction layer and enabling one or more baseband
processing algorithms to be represented using high level software. Preferably, the 
basestation is a basestation as defined in the first aspect. 

In a sixth aspect of the present invention, there is computer hardware programmed with 
the computer software of the fifth aspect. 

In a seventh aspect, there are one or more RF elements suitable for connection to a digital
radio basestation, in which the basestation is as defined in the first aspect.

Further specifics of the invention and its various aspects are contained in the claims.

Brief Description of the Drawings 

The invention will be described with reference to the accompanying drawings in which:

Figure 1 is a schematic showing algorithm scheduling in the GBP™;

Figure 2 is a schematic showing the GBP architecture ("Generic Baseband
Processor") implementation of the present invention;

Figure 3 is a schematic showing how the CVM™ ("Communication Virtual
Machine") shields hardware from high level software;

Figure 4A is a schematic showing GBP RF interfaces, digitised IF feeders and
third party RF modules;

Figure 4B is a schematic showing a baseband processing card;

Figure 5 is a schematic showing the structure in a baseband communications
stack;

Figure 6 is a schematic showing the common blocks and structure in a CVM;

Figure 7 is a schematic showing the relationship between the CVM, the hardware
and the stack;

Figures 8 and 9 are schematics showing steps in the development cycle using the
CVM.



Detailed Description

The present invention will be described with reference to an implementation from
RadioScape Limited of London, England of a software defined radio ("SDR")
basestation running over a Generic Baseband Processor ("GBP™"). The basestation is
a UMTS node-b. As noted above, the essence of the RadioScape approach is to use
commodity protocols and hardware to turn a basestation, previously a highly expensive,
vendor-locked, application specific product, into a generic, scalable baseband platform,
capable of executing many different modulation standards with simply a change of
software. In the RadioScape system, IP is used to connect this device to the backnet,
and IP is also used to feed digitised IF to and from third party RF modules, using an
open data and control format.

The SDR-based UMTS node-b basestation is a software description (in C++, DSP
assembler and Handel-C / VHDL) running over the GBP. The GBP is a powerful
hardware platform designed to provide the MIPs and throughput required for wireless
communication digital signal processing tasks. It connects to the network infrastructure
using IP, and communicates with an RF module or modules via an IP bus carrying
digitised IF signals using RTP (Real-time Transport Protocol) over UDP. Onboard processing
resource is provided by a number of FPGAs (field programmable gate arrays) and
high-specification DSPs (digital signal processors). In an optimised example, some or all of the
hard datapaths on the FPGA may be considered to be migrated over to an ASIC for cost
efficiency. RadioScape's runtime, the CVM (or Communication Virtual Machine),
provides the hardware abstraction layer, lying above the system RTOS (which is
third-party); the CVM allows the [...] of the stack to be called in a platform
neutral manner. The node-b control flow code itself then executes over the CVM on the
GBP.
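
As an illustration of carrying digitised IF in RTP over UDP, the following sketch packs a block of 16-bit IF samples behind a standard 12-byte RTP fixed header. The header layout is that of RTP version 2; the payload type value 96, the 16-bit big-endian sample format and all function names are assumptions made for this example, since the specification does not define the payload format.

```python
import socket
import struct
from typing import List, Tuple

RTP_VERSION = 2
IF_PAYLOAD_TYPE = 96  # assumption: a dynamic payload type chosen for IF samples


def pack_if_packet(seq: int, timestamp: int, ssrc: int,
                   samples: List[int]) -> bytes:
    """Build one RTP packet carrying signed 16-bit IF samples."""
    first = RTP_VERSION << 6          # no padding, no extension, no CSRCs
    second = IF_PAYLOAD_TYPE & 0x7F   # marker bit clear
    header = struct.pack("!BBHII", first, second, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    payload = struct.pack("!%dh" % len(samples), *samples)
    return header + payload


def unpack_if_packet(packet: bytes) -> Tuple[int, int, int, List[int]]:
    """Recover (sequence number, timestamp, SSRC, samples) from a packet."""
    first, _second, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    if first >> 6 != RTP_VERSION:
        raise ValueError("not an RTP version 2 packet")
    count = (len(packet) - 12) // 2
    samples = list(struct.unpack("!%dh" % count, packet[12:12 + 2 * count]))
    return seq, timestamp, ssrc, samples


def send_if_packet(packet: bytes, host: str, port: int) -> None:
    """Send one packet as a UDP datagram (host and port are caller supplied)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))
```

The sequence number lets a receiving RF module or processor board detect lost or reordered datagrams, and the timestamp lets it place each block of samples on a common sample clock.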

A set of control APIs is available by means of which data and software providers can
'hook into' the UMTS network. The point of this enterprise is that, although the whole
3G development has supposedly been driven by the needs of data (higher bursty
bandwidth for IP packet data across increasingly flat backhaul cores), in fact it is rather
difficult, as a software or data vendor, to make use of the facilities offered by the
underlying network. To this end, RadioScape's APIs provide an open, COM
(Component Object Model) [...] and SNMP (Simple
Network Management Protocol) based framework [...] programmers
may connect with [...]. Through the use of 'drivers'
this framework may be implemented over any high-bandwidth network (e.g., CDMA-2000,
Bluetooth, etc.) and may also be implemented for any vendor's implementation of
a UMTS 3G network. As noted above, the RF interface (control, timing synchronization
and digitised IF) will be completely open and published by RadioScape, hence 'shopping
around' for the best RF provider will become a reality for network providers utilising the
GBP paradigm.




GBP Paradigm 

Everyone is familiar with the success of the PC. The reason for this success is that it
offers an appropriate set of hardware resources (screen, memory, processor, keyboard
etc.), wrapped up in a hardware abstraction layer (the Windows virtual machine),
sufficient to meet the demands of a wide range of applications, which may then be developed
entirely using high-level software. And there are many benefits to solving application needs in
software: it is fast to produce, relatively cheap to develop (allowing a wide number of
players to enter the market, generating competition), and the end product has an almost
zero distribution and storage cost.

The PC is also a generic and extensible hardware design, allowing multiple hardware
vendors to build variants and peripherals in competition, driving availability and quality
up and end-user costs down. An insight of the present invention is that it would be
attractive if a similar paradigm could be applied to the digital signal processing (DSP)
world. Unfortunately, however, until recently this fraternity has been operating in the
equivalent of the stone age, cut off from the PC platform because it has an entirely
different set of algorithm requirements from the business / home application space. As
noted earlier, the need for wide (i.e., highly parallel) algorithms with low latencies, high
numerical instruction loadings and massive DMA channels has tended to lead to the
development of application specific hardware, often using ASICs. These devices can take
over a year to produce, and have a large development expense associated with them.
Furthermore, such software architectures as do exist have tended to be very monolithic
and intractable, being based around non object-oriented languages such as C, limited
virtual machines (the RTOS layer), and non-intuitive hardware description systems such
as VHDL.

The result of all of this is that for complex systems such as wireless communications
basestations, vendors have been able to force network operators into purchasing
hardware, software and RF components together, all too often in a sub-optimal
configuration. Closed (or effectively closed) interfaces into the basestations have led to
the necessity to use that vendor's base station controllers also, further reducing choice
and driving down quality. And significant changes in the underlying communications
standard have all too often required a 'forklift upgrade', with hardware having to be
modified on site.

1.1. Putting it Together - The CVM and GBP 

As we have seen above, a key concept is that a well-defined hardware architecture,
wrapped in an appropriate virtual machine, can allow most or all complex baseband
processing algorithms, including those for UMTS, to be represented using high-level software,
with all the advantages that this entails for rapid development, fast modification time,
encapsulation, etc.

The hardware we term the generic baseband processor, or GBP. The hardware abstraction
layer we term the communications virtual machine, or CVM. Taken together, they form a
platform supporting modulation stacks as pure software components. Let us now look in
a little more detail at this architecture.

The GBP will utilise a conventional PCI-bus backplane. This is a well defined, relatively
high-bandwidth standard, for which sophisticated bus-mastering DMA bridge chips, such
as the PLX-9080, are readily available at low cost. The initial GBP will use the
conventional 32 bit x 33 MHz PCI bus, but subsequent versions may utilise the faster,
wider bus configurations if necessary.

The industry standard I2O messaging layer will be supported over the PCI bus, as an
additional abstraction layer, allowing various underlying communications topologies to be
used (e.g. PCI, RaceWay etc.).

Another advantage of the PCI architecture is that it is supported by PCs. Although the
PC is by no means suitable for use as the direct substrate for baseband processing (it is
too slow, too costly, and non-parallel, and runs Windows, an inappropriate virtual
machine), it nevertheless provides an excellent platform for remote monitoring of the
platform, has unparalleled peripheral support, and is provided with industry-leading
development tools. Therefore, the first component of the GBP is a plug-in PC card, such
as that provided by Advantech, and used successfully by RadioScape in other mission
critical applications (e.g., E-147 digital broadcasting multiplexers). The card will run NT,
but will not be critically involved in the mainstream operation of the GBP; rather, its
functions will involve boot control, peripheral and processor card configuration, and
remote monitoring support, in addition to provision of the bus-mastering fast Ethernet
IP interface onto the backnet for incoming and outgoing Iub messages.

The GBP's mainstream functioning will be carried out by one or more generic processing
modules, which will be supplied as standard design PCI cards, initially produced by
RadioScape. Each card will contain a high-speed C64x TI DSP, a Xilinx multi-million
gate FPGA, 32 MB of SDRAM, and a PCI bus-mastering bridge chip (optionally, the
PCI interface of the Xilinx part may be used, as discussed below). The FPGA will be
programmed at boot time (or afterwards) by the PC module, which is possible because its
control ports will be mapped into the memory space addressable on the PCI bus by the bridge
chip. The TI DSP will be programmed at boot in the same manner. In carrying out the
normal operation paradigm, data will enter from the IP port (supported over the fast
Ethernet protocol on the PC card) and get DMA'd into the memory of the specified
default processing module, which has the task of running the high-level IP message
parsing/formatting code (using defined ASN.1 maps for the Iub messages), and the
scheduler.

The scheduler maps requests to execute a specified algorithm, with specified input data,
processing requirements and constraints (e.g., priority), onto an execution request for a
particular instance of that algorithm, on a particular device (DSP or FPGA) on a
particular processing board. The process is shown in the diagram at Figure 1. Note that
the scheduler is aware of the initial, a priori partitioning decisions made during the design
phase, but that it need not simply follow a complete timing model defined during that
design process: there is a significant 'runtime' aspect to the data flow.
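
The interplay between the a priori partitioning and the runtime decision can be sketched as follows. The data layout and the tie-breaking policy are assumptions for illustration: the partitioning is modelled as a preference-ordered list of (board, device) targets per algorithm, and the runtime aspect is modelled as choosing the least loaded of those targets. The names Scheduler, dispatch and complete are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Target = Tuple[int, str]  # (processing board number, "DSP" or "FPGA")


@dataclass
class Scheduler:
    # A priori hints from the design phase, preferred target first.
    partition: Dict[str, List[Target]]
    # Runtime load: number of outstanding requests per target.
    queue_depth: Dict[Target, int]

    def dispatch(self, algorithm: str) -> Target:
        candidates = self.partition[algorithm]
        # min() returns the first of equally loaded targets, so ties fall
        # back to the a priori preference order.
        target = min(candidates, key=lambda t: self.queue_depth.get(t, 0))
        self.queue_depth[target] = self.queue_depth.get(target, 0) + 1
        return target

    def complete(self, target: Target) -> None:
        # Called when a target signals that a request has finished.
        self.queue_depth[target] -= 1
```

A priority constraint, as mentioned in the text, could be added by ordering pending requests before dispatch; it is omitted here to keep the sketch short.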

Once the decision for execution has been made, the scheduler writes an identifier record
for the memory block in question into a queue (using mapped memory across the PCI
bus, ultimately using the I2O messaging interface) on the target processor card. Each
instance of each algorithm on the card will maintain its own queue, and the scheduler will
be informed about the logical configuration of the GBP (its installed cards and
algorithms) by a physical configuration file, generated as part of the a priori datapath
partitioning design flow, as discussed above. Updates to the queues on a card may be
signalled by an interrupt on the PCI bus upon completion; access to the queue memory
will be protected by a mutex enforced by the PCI bridge device.

Each algorithm instance on a given card blocks until it discovers one or more memory
block identifier records (MBIRs) in its queue. Upon discovery of such a record, it will
DMA the data from its current location (specified in the MBIR), which may be located in
the bus-exposed memory map of another card, into its local working store. Transfers
between algorithms on the same card are optimised out and the scheduler will be able to
take into account hints about likely next algorithms to call in order to maximise the
probability of this happening, given the current physical configuration of the system.
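
The queue discipline described above can be modelled in software as in the sketch below. It is a simulation under stated assumptions only: card memories are modelled as Python bytearrays, the DMA as a byte copy, and the PCI interrupt and bridge mutex as a thread-safe queue. The names MBIR, AlgorithmInstance and their fields are hypothetical.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class MBIR:
    """Memory block identifier record: where the input data currently lives."""
    card_id: int
    offset: int
    length: int


class AlgorithmInstance:
    def __init__(self, card_id: int, memories: Dict[int, bytearray],
                 process: Callable[[bytes], int]) -> None:
        self.card_id = card_id
        self.memories = memories          # bus-exposed memory of every card
        self.process = process
        self.inbox: "queue.Queue[Optional[MBIR]]" = queue.Queue()
        self.results: List[int] = []
        self.dma_transfers = 0
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        while True:
            record = self.inbox.get()     # blocks until a record arrives
            if record is None:            # sentinel: shut the instance down
                break
            source = self.memories[record.card_id]
            block = bytes(source[record.offset:record.offset + record.length])
            if record.card_id != self.card_id:
                # Only inter-card moves count as DMA; same-card transfers
                # are treated as optimised out.
                self.dma_transfers += 1
            self.results.append(self.process(block))

    def stop(self) -> None:
        self.inbox.put(None)
        self._thread.join()
```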

The diagram at Figure 2 shows the high-level hardware architecture of the GBP
(excluding the specialised IP processing card).

Once the data has been transferred to local memory, the origin memory block will be
freed up for reuse (assuming that an inter-card DMA has been needed) and processing
will begin on the data. Processing of various algorithms on the FPGA can, of course,
happen truly in parallel, subject to contention for access to the on-card memory.
Processing of algorithms on the DSP will take place under the supervision of a
multitasking RTOS (real-time operating system) such as TI's DSP/BIOS.

Note that the data block for the algorithm will also contain the parameters block, which
will be used to initialise it. There is also the concept of session state, which is maintained by
the scheduler. Higher level code can access an API to open new sessions, obtain session
ids, and close a session. Logical operations can then be scheduled with a constraint that
they execute within the same session (which will essentially constrain them to execute on
the same physical card, if possible, to prevent [...] state having to be DMA'd
around). The algorithms themselves [...] to go along with an executing
session algorithm, which will be DMA'd to the next board's memory space in the case
where a follow-on call cannot be scheduled on the same physical board.
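
A session API of the kind described might look like the following sketch. The method names open_session, place and close_session, and the convention of passing in the list of currently available cards, are assumptions made for illustration. The second element of the value returned by place indicates whether session state would have to be moved (DMA'd) to a different board.

```python
import itertools
from typing import Dict, List, Optional, Tuple


class SessionManager:
    """Hypothetical session bookkeeping kept by the scheduler."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._card_for: Dict[int, Optional[int]] = {}

    def open_session(self) -> int:
        session_id = next(self._ids)
        self._card_for[session_id] = None   # not yet bound to any card
        return session_id

    def place(self, session_id: int, free_cards: List[int]) -> Tuple[int, bool]:
        """Choose a card for the next call; report whether state must move."""
        current = self._card_for[session_id]
        if current is None:
            card, state_moved = free_cards[0], False   # first call binds
        elif current in free_cards:
            card, state_moved = current, False         # stay on the same card
        else:
            card, state_moved = free_cards[0], True    # follow-on elsewhere
        self._card_for[session_id] = card
        return card, state_moved

    def close_session(self, session_id: int) -> None:
        del self._card_for[session_id]
```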

RadioScape's CVM (communication virtual machine) will execute over the board on each
processor, providing a common environment for the low-level operations to execute
within, allowing access to the scheduler data, session state, common DMA channels etc.
The resource-intensive algorithms themselves will, for the most part, be embedded as
implementations of generic signal processing algorithms exposed by the CVM APIs. The
CVM shields hardware from the high level software, as schematically shown in Figure 3.

Inherent in the GBP architecture are concepts of redundancy support, with multiple data
paths being available, the ability to act as the hardware substrate for multiple modulation
standards, and the ability to change code loads at will, remotely via the IP network (e.g.
new code, including new or updated modulation standards, can be loaded onto the
basestation remotely).

To change (e.g.) the deployment of algorithms across processors, the target processor
will first be decommissioned, by uploading a new physical mapping file that does not
include any entries for that device. Then, when all pending algorithms assigned to either
the DSP or FPGA on the target board have cleared, the PC card will DMA data (whether
new machine code for the DSP or a fuse map for the FPGA) into the device, and then
reactivate the card for processing. As a final stage, the physical mapping will be modified
once more to reflect the availability of the new algorithms, which will cause calls to be
scheduled to the board once again. If redundancy is utilised, then the only effect on the
GBP during the reprogramming period will be one of overall capacity (and even
then, with simple N+1 hardware redundancy, this problem may be obviated, simply by
reconfiguring the backup card instead, and then making the card with the 'old' load the
logical backup in its place).
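The decommission, reload and recommission sequence above can be sketched as a function over a physical mapping. This is a hedged illustration: the mapping layout (algorithm name to list of devices), the `pending` counter and the `load_image` callback are hypothetical, and a real system would wait for pending work to clear rather than raise an error.

```python
def reprogram(mapping, pending, target, new_algorithms, load_image):
    """Return a new mapping after reloading `target`; the input is not modified.

    mapping:        dict, algorithm name -> list of device names able to run it
    pending:        dict, device name -> number of calls still queued on it
    new_algorithms: algorithm names the device provides after the reload
    load_image:     callable(device) performing the code or fuse-map load
    """
    # 1. Decommission: build a mapping with no entries for the target device.
    drained = {alg: [d for d in devs if d != target]
               for alg, devs in mapping.items()}
    # 2. Everything already assigned to the target must have cleared.
    #    (Simplification: this sketch refuses instead of waiting.)
    if pending.get(target, 0) != 0:
        raise RuntimeError(f"{target} still has pending work")
    # 3. Load the new DSP machine code or FPGA fuse map.
    load_image(target)
    # 4. Advertise the device again for the algorithms it now provides.
    for alg in new_algorithms:
        drained.setdefault(alg, []).append(target)
    return drained
```

Returning a fresh mapping rather than editing in place mirrors the text's description of uploading a new mapping file at each stage.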

Two versions of the system are envisioned, one with the ability to 'hot swap' PCI cards
themselves (which involves bridges for each card on the PCI backplane) and the other
with longer bridged sections, which will be a cheaper alternative (but will sacrifice
flexibility, since in the case of a hardware failure the whole GBP will require powering
down before it can be replaced).

During use, the PC code will run 'heartbeat' tests on all the cards, and report any failures
using SNMP. The PC card is itself protected against hanging by a watchdog timer.
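A minimal sketch of this supervision loop follows, assuming a software stand-in for the hardware watchdog. The names `Watchdog`, `heartbeat_round`, `probe` and `report` are hypothetical; in the system described, `report` would correspond to sending an SNMP notification.

```python
import time


class Watchdog:
    """Software stand-in for a hardware watchdog timer."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_kick = clock()

    def kick(self):
        self.last_kick = self.clock()

    def expired(self):
        return self.clock() - self.last_kick > self.timeout_s


def heartbeat_round(cards, probe, report, watchdog):
    """Probe every card once, report failures, then kick the watchdog."""
    failed = [card for card in cards if not probe(card)]
    for card in failed:
        report(card)          # stands in for an SNMP failure report
    watchdog.kick()           # shows that the supervising loop itself is alive
    return failed
```

If the supervising code hangs and stops calling `heartbeat_round`, `expired()` eventually becomes true, which is the condition a real watchdog would use to reset the PC card.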

Because the processing cards and (potentially) the backplane and PC card are generic
devices, spares holding is much simplified. The CVM provides developers with a-priori
resource prediction capabilities, greatly assisting [...] for deployment
to particular tasks. Another advantage of the CVM [...] device
platform and interconnect primitives, enabling (e.g.) the switch to larger-gate FPGA
boards when these become available.

1.2. Interfacing to the RF Module(s)

The ultimate point of the GBP is to execute high-bandwidth layer-1 air-interface
algorithms in a flexible software-defined manner. Therefore, a critical part of the GBP
design is the method by which it interconnects to the radio frequency (RF) elements (by
which we imply all of the up and downconversion elements and power amplification).

In an ideal world, RF data would simply be digitised directly from the antenna, and
synthesised directly at the target centre frequency. Unfortunately, current ADC / DAC
and signal processing substrates are insufficient to realise this. Therefore, we do require
some hardware to perform the tasks of upconverting data for output to the target centre
frequency, then amplifying it for transmission, and similarly downconverting input data
to an appropriate IF (intermediate frequency) at which it may be digitised.

Further complexity is added by the desire to use simple antenna diversity on transmit
(same analogue stream time locked to multiple output points), 'smart' antenna arrays
(where a grid of output values is computed and transmitted to a number of DACs), and
input diversity (where the input from multiple antennas is accepted and subsequently
combined, in order to mitigate the effects of channel fading).

The overall RF interfacing architecture is shown in Figure 4A.

A core design philosophy for the GBP is that RF modules (and subsequent amplification
and antenna stages) will be provided by appropriate components houses with the
necessary design skills for analogue engineering, but who find the prospect of the level of
digital baseband software design required to implement complex algorithms like UMTS
layer 1 extremely daunting. To this end, an open interface is specified between the GBP
and the RF module.



BNSDOC1D: <WO 0154300A2_I_> 



WO 01/54300 PCT/GB01/00280



The interface between the RF modules and the GBP therefore has five components:
power feeds (straightforward); data (high bandwidth digitised IF data passing in both
directions); control (messages from the GBP to the RF for such purposes as setting centre
frequency for output, changing amplification levels, etc.); status and alarm messages passed
back from RF to GBP; and a timing / sync signal from GBP to RF module (to enable
operations to be carried out relative to a particular time code). Within the GBP, this
timecode can either be provided through the use of an external 1PPS signal from a GPS
unit into the IF card, or by using the network time protocol (NTP) to provide long-term
estimates into the card. The card itself contains a high precision TCXO which is divided
down and then locked to either the GPS or NTP signals. Figure 4B is a schematic of
the baseband processing card.

SNMP shall be used as the message encoding for control, status, and alarm messages.
This shall be implemented over a fast IP channel, which may be selected from a range:

• Fast Ethernet

• Gigabit Ethernet

• Bus LVDS

• FiberChannel

• Firewire, etc.

Due to the large step-up in processing required by the final stages of data processing for
output to air / input from air in wideband systems such as WCDMA, and the high
bandwidth of data DMA required in such systems, the PCI bus will not be used as the
default IF transport channel; rather, a special IF version of the generic processing card
will be provided, which will contain the high-bandwidth digital IF-baseband and
baseband-IF modules (e.g., root raised cosine filtering, implemented on an FPGA), the
timing system mentioned above, and the high-bandwidth IF<->IP controller. Bus LVDS
(low voltage differential signalling) will be the initial system of choice for the UMTS
Node-B implementation where relatively short distances (<= 10m) are expected between
the basestation processing unit and the antenna.

This architecture minimises the load on the PCI bus and allows for the distribution of IP
'digital feeders' up the mast to the RF hardware, eliminating problems due to heat
expansion / contraction and loss experienced with conventional analogue feeders. Use of
IP broadcasting on this connection allows multiple RF units to share the same input if
desired, for transmit diversity purposes. Therefore transmit diversity can be managed
either with conventional multiple analogue feeds from the one RF unit, or with multiple
RF units attached to the same digital feeder.

Synchronisation of output will be performed using RTP over UDP/IP for the packets
with a 1pps signal distributed from the IF card along a separate coax feed. At the RF
interface, this will control the loading of data from the UDP/IP packets into the DACs.
Control information will be sent in timestamped SNMP messages and will be similarly
applied at the appropriate moment by the RF module / amplifier.

Because of the open interface, using accepted standards with a digital IP transport, it will
become possible to procure RF modules for a particular frequency / power requirement
from an appropriate supplier independently of the baseband processing code. This has
the potential to provide increased quality and better pricing for network commissioners.

1.3. Air Interface Standards 



RadioScape's Node-B will be [...] compliant, and it will provide the hardware
and software for this air interface. However RadioScape's hardware will be
re-configurable for the 2G/GSM [...] DTT technology, provided that
the appropriate application-specific code loads are available, and necessary RF adapter
modules provided. This is one of the advantages of the GBP concept, and fits well with a
goal of shared transmission towers; keeping the same hardware for multiple air-interface
standards also allows simplified spares holding and redundancy management for the
network provider.

RF Unit Implementation issues

1.4. System connection

As has been discussed earlier, distribution from the GBP IF card to the RF unit will have
five components:

o Power feed (straightforward).

o High speed low-IF sample data (either going to a DAC or being sent from an
ADC). This information will be transmitted using UDP/IP. Packets will carry
timestamps according to the Real-time Transport Protocol (RTP). Bus LVDS will be
used as the initial underlying transport.

o SNMP management messages used to configure the performance of the RF
module, and sent back to provide status about the RF module (hence this counts
as two components). RadioScape will publish a MIB for this interface. It will be
transmitted over the same bus LVDS link as the data to save wiring complexity.
SNMP messages will contain an RTP timestamp field allowing commands and
messages to utilise the same timebase control as the sample datastream.

o A 1pps coax distribution used to synchronise clocks. This will be generated from
the master IF card on the GBP, either as a passthrough of an external 1pps from
a GPS unit (preferred), or else as the output of a local, onboard clock conformed
to an NTP message from the main distribution network (this will not be
sufficiently accurate for fine-grained location services, however).
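The sample-data component listed above (UDP/IP packets carrying RTP timestamps) can be illustrated with a sketch that packs IF samples behind the 12-byte RTP fixed header. The payload layout (16-bit signed big-endian samples), the dynamic payload type 96 and the function names are assumptions made for this example; the text does not define the payload format.

```python
import struct

RTP_VERSION = 2
PAYLOAD_TYPE = 96          # dynamic payload type; an assumption, not from the text


def pack_if_packet(seq, timestamp, ssrc, samples):
    """Build a 12-byte RTP fixed header followed by 16-bit signed IF samples."""
    byte0 = RTP_VERSION << 6               # no padding, no extension, no CSRCs
    byte1 = PAYLOAD_TYPE & 0x7F            # marker bit clear
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)
    payload = struct.pack(f"!{len(samples)}h", *samples)
    return header + payload


def unpack_if_packet(packet):
    """Inverse of pack_if_packet; returns (seq, timestamp, ssrc, samples)."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    if byte0 >> 6 != RTP_VERSION:
        raise ValueError("not an RTP version 2 packet")
    count = (len(packet) - 12) // 2
    samples = list(struct.unpack(f"!{count}h", packet[12:12 + 2 * count]))
    return seq, timestamp, ssrc, samples
```

The resulting bytes could be handed to an ordinary UDP socket; the receiver would use the timestamp field to decide when each block of samples is loaded into the DAC.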

At the RF module, a small, low-cost processor (e.g. an ARM) will decode the control
messages and manage the timed updates to core parameters (e.g., centre frequency,
output RF power, etc.). Each update will be locked to an RTP clock ultimately set to the
1pps feed.
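The timed updates handled by the RF module's processor can be sketched as a queue of commands ordered by timestamp. This is an illustrative assumption about how such a controller might be organised; the class name `TimedControl`, the parameter names and the integer timestamps are hypothetical.

```python
import heapq


class TimedControl:
    """Queue of parameter updates, each applied at its own timestamp."""

    def __init__(self):
        self._queue = []      # heap of (timestamp, order, name, value)
        self._order = 0       # tie-breaker that preserves submission order
        self.params = {}

    def submit(self, timestamp, name, value):
        heapq.heappush(self._queue, (timestamp, self._order, name, value))
        self._order += 1

    def tick(self, now):
        """Apply every queued update whose timestamp is <= now."""
        applied = []
        while self._queue and self._queue[0][0] <= now:
            _, _, name, value = heapq.heappop(self._queue)
            self.params[name] = value
            applied.append(name)
        return applied
```

Because commands and sample packets share one timebase, a change of centre frequency or output power can be made to take effect at the same instant as the samples it is meant to accompany.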

It is appreciated that some degree of complexity is added through the use of IP here.
However, it has the benefit that a great number of transports, some highly ubiquitous
and cost effective, may be utilised for connection. RadioScape intends to support at least
Gigabit Ethernet for this IP connection initially.




RadioScape will provide an RF card design pack, including schematics, ARM code and all
necessary IP drivers, MIBs and timing diagrams, under NDA, to any interested party
who wishes to build an RF module that will interconnect with the GBP.

Note that although the discussions here assume that the RF head will be a transceiver, it
is entirely possible to use the GBP as a transmit only, or as a receive only, substrate for a
particular standard where this operation is appropriate (e.g., a broadcast system such as
DAB or DVB-T). Note also that multiple standards may be executing simultaneously on
a single GBP, given sufficient processing and memory resources, and sufficient interface
bandwidth.

1.5. RF Module

It is intended that this architecture will enable the RF module to be sited very close to the
antennas, thereby obviating the requirements for lengthy analogue feeders. There is some
additional cost and complexity [...] over IP, but
because [...]

Clearly, the power amplification requirements for the RF module to some extent will
determine, for a particular RF architecture target, whether or not it is possible to site the
full headend at the top of the tower; but for most systems (including UMTS) this will
indeed be possible. The use of smart antennas is also facilitated by this architecture,
provided that the IP network used for IF distribution has sufficient bandwidth to carry
the modulation payload for each of the component antenna segments.

Appendix 1: the CVM 

The CVM, or Communications Virtual Machine, is a foundation to the present
invention. This appendix describes it, and its application to two-way broadcast
baseband stacks (i.e. as found in a basestation), in more detail.

Technology Background: digital signal processing, DSPs and baseband stacks.

Digital signal processing is a process of manipulating digital representations of analogue
and/or digital quantities in order to transmit or recover intelligent information which has
been propagated over a channel. Digital signal processors perform digital signal
processing by applying high speed, high numerical accuracy computations and are
generally formed as integrated circuits optimised for high speed, real-time data
manipulation. Digital signal processors are used in many data acquisition, processing and
control environments, such as audio, communications, and video. Digital signal
processors can be implemented in other ways, in addition to integrated circuits; for
example, they can be implemented by micro-processors and programmed computers.
The term 'DSP' used in this specification covers any device or system, whether in
software or hardware, or a combination of the two, capable of performing digital signal
processing. The term 'DSP' therefore covers one or more digital signal processor chips;
it also covers the following: one or more digital signal processor chips working together
with one or more external co-processors, such as an FPGA (field programmable gate
array) or an ASIC programmed to perform digital signal processing; as well as any Turing
equivalent to any of the above.

In the communications sector, a DSP will be a critical element for a baseband stack as
the baseband stack runs on the DSP; the stack plus DSP together perform digital signal
processing. The term 'baseband stack' used in this specification means a set of
processing steps (or the structures which perform the steps) including one or more of the
following: source coding, channel coding, modulation, or their inverses, namely source
decoding, channel decoding and demodulation. In addition, the term 'baseband stack'
should be construed as including structures capable of processing digital signals without
any form of down conversion; a software radio would include such a baseband stack. As
will be appreciated by the skilled implementer, source coding is used to compress a signal
(i.e. the source signal) to reduce the bitrate. Channel coding adds structured redundancy
to improve the ability of a decoder to extract information from the received signal, which
may be corrupted. Modulation alters an analogue waveform in dependence on the
information to be propagated.

Baseband stacks are found in mobile telephones (e.g. a GSM stack) and
digital radio receivers (e.g. a DAB stack), as well as in other one-way or two-way digital
communications devices. The term 'communications' used in this specification covers all
forms of one or two way, one to one and one to many communications and
broadcasting. The terms 'designing' and 'modelling' typically include the processes of
one or more of emulation, resource calculation, diagnostic analysis, hardware sizing,
debugging and performance estimating.

The increasing complexity of communications systems places intense pressure on
baseband stack development

The complexity of communications systems is increasing on an almost daily basis. There
are a number of drivers for this: traffic on the Internet is increasing at 1000% pa. Much
of this (largely bursty) data is moving to wireless carriers, but there is less and less
spectrum available on which to host such services. These facts have led to the use of
ever more complex signal processing algorithms, in order to squeeze as much data as
possible into the smallest possible bandwidth. In fact, the complexity of these algorithms
has been increasing faster than Moore's law (i.e. that computing power doubles every 18
months), with the result that conventional DSPs are becoming insufficient. For complex
terminals, therefore, an ASIC must be produced to manage the vast parallel processing
load involved. However, this is where the problems really begin. For not only are the
algorithms used more complex on the signal processing front; the use of bursty,
variable-QoS, often ephemeral transport channels, mandated by the move from primarily voice
traffic to primarily Internet-related traffic, needs ever more sophisticated control plane
software, even at Layer 1 (which requires hard real-time code). Conventional DSP
toolsets do not provide an appropriate mechanism to address this problem, and as a
result many current designs are not scalable to deal with 'real world' data applications.

However, the high MIPs requirements of modern communication systems represent only
part of the story. The other problem arises when a multiplicity of standards (e.g., GSM,
IS-136, UMTS, IS-95 etc.) need to be deployed within a single SoC (System on a Chip).
SoC devices supporting multiple standards will be increasingly attractive to device
vendors seeking to tap efficiently different markets in different countries; also, it is
expected that the next generation UMTS phones will have not only GSM (or current
generation) capabilities but also added features, such as DAB (Digital Audio
Broadcasting) receivers, hence requiring baseband stacks for UMTS, GSM and DAB.
The complexity of communications protocols is now such that no single company can
hope to provide solutions for all of them. But there is an acute problem building an SoC
which integrates IP from multiple vendors (e.g. the IP in the three different baseband
stacks listed above) together into a single coherent package in increasingly short
timescales: no commercial system currently exists in the market to enable multiple
vendors' IP to be interworked. Layer 2 and layer 3 software (generally, soft real-time
code) is more straightforward, since it may simply be run as one process of many as
software on a DSP or other generalised processor. But layer 1 IP (hard real time, often
parallel algorithms) presents a much more difficult problem, since the necessary hardware
acceleration often dominates the architecture of the whole layer, providing non-portable,
fragile, solution-specific IP.

Overview of deficiencies in current models of baseband stack development

In the past, baseband stacks have been relatively simple, the amount of required
high-MIPs functionality has been relatively small and only modest amounts of multi-standard,
multi-vendor integration have been performed. But as noted above, none of these now
apply: (a) the bandwidth pressure means that ever more complex algorithms (e.g., turbo
decoding, MUD, RAKE, etc.) are employed, necessitating the use of hardware; (b) the
increase in packet data traffic is also driving up the complexity of layer 1 control planes
as more birth-death events and reconfigurations must be dealt with in hard real time; and
(c) time to market, standard diversification and differentiation pressures are leading
vendors to integrate more and more increasingly complex functionality (3G, Bluetooth,
802.11 etc.) into a single device in record time, necessitating [...] IP
to produce an SoC (system on chip) for a particular target application.

Currently, there is no adequate solution for this problem; the VHDL toolset providers
(such as Cadence and Synopsys) are approaching it from the 'bottom up': their tools are
effective for producing individual high-MIPs units of functionality (e.g., a Viterbi
accelerator) but do not provide tools or integration for the layer 1 framework or control
code. DSP vendors (e.g., TI, Analog Devices) do provide software development tools,
but their real time models are static (and so do not cope well with packet data burstiness)
and their DSPs are limited by Moore's law, which acts as a brake to their usefulness.
Furthermore, communication stack software is best modelled as a state machine, for
which C or C++ (the languages usually supported by the DSP vendors) is a poor
substrate.
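The remark that communication stack software is best modelled as a state machine can be made concrete with a small table-driven sketch. The states and events below are invented for illustration and do not correspond to any particular standard's layer 1 control plane.

```python
# (state, event) -> next state; all names here are illustrative only.
TRANSITIONS = {
    ("idle", "channel_request"): "configuring",
    ("configuring", "config_done"): "active",
    ("active", "reconfigure"): "configuring",
    ("active", "release"): "idle",
    ("configuring", "release"): "idle",
}


def run(events, state="idle"):
    """Feed events through the table; reject anything the table lacks."""
    trace = [state]
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not allowed in state {state!r}")
        state = TRANSITIONS[key]
        trace.append(state)
    return trace
```

Keeping the transitions in a table makes the channel set-up, reconfiguration and release behaviour explicit and checkable, rather than scattering it through procedural code.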

Detailed analysis of deficiencies in current models of baseband stack
development

Conventionally, baseband stack development for digital communications is fragmented
and highly specialised. For example, the initial development of the signal processing
algorithms that are the heart of a baseband stack is generally performed on a
[...] (e.g. Matlab), with the fitting to a particular
memory and MIPs (Million Instructions per Second) budget for the final target DSP
being done by skilled estimation using a conventional spreadsheet. Once this modelling
process has been performed satisfactorily, code modules and infrastructure software for
the stack will be written, adapting existing libraries where possible (and possibly an
RTOS (Real-Time Operating System)). Then, a 'real time' prototype hardware system
will be built (sometimes called a 'rack') in which any required hardware acceleration will
be prototyped on PLDs (Programmable Logic Devices) where possible. This will be
tested off air, and necessary changes made to the code. Once satisfactory, the stack will
be 'locked off' and the final ASIC (Application Specific Integrated Circuit) (incorporating
the hardware acceleration modules as on-chip peripherals) will be produced. The
resultant baseband DSP or DSP components are then tested and then shipped.

There are a number of problems with this 'traditional' approach. The more important of
these are that:

• The resulting stacks tend to have a lot of architecture specificity in their construction,
making the process of porting to another hardware platform (e.g. a DSP from
another manufacturer) time consuming.

• The stacks also tend to be hard to modify and 'fragile', making it difficult both to
implement in-house changes (e.g., to rectify bugs or accommodate new features
introduced into the standard) and to licence the stacks effectively to others who may
wish to change them slightly.

• Integration with the MMI (Man Machine Interface) tends to be poor, generally
meaning that a separate microcontroller is used for this function within the target
device. This increases chip count and cost.

• The process is quite slow, with about 1 year minimum elapsed time to produce a
baseband processor for a significantly complex system, such as DAB (Digital Audio
Broadcasting).

• The process puts a lot of stress on technical authorities (so called 'gurus') to govern
the overall best way to allocate buffers, manage downconversion, insert digital filters,
generate good channel models and so on. This is generally a disadvantage since it
adds a critical path and key personnel dependency to the project of stack production
and lengthens timelines. The resulting product is quite likely not to include all the
appropriate current technology because no individual is completely expert across all
of the prevailing best practice, nor will the gurus or their team necessarily have time
to incorporate all of the possible innovations in a given stack project even if they did
know them.

• The reliance on manual computation of MIPs and memory requirements, and the
bespoke nature of the DSP modules and infrastructure code for the stack, means that
there is an increased probability of error in the product.

• An associated point is that, generally, real-time prototyping of the stack is not possible
until the 'rack' is built; a lack of high-visibility debuggers available even at that point
means that final stack and resource 'lock off' is delayed unnecessarily, pushing out the
hardware production time scale. High visibility debuggers would, if available, be very
useful since they provide, when developing in a high level language like C++, the
ability in the development tool to place break points in the code, halt the processing
at that point and then examine the contents of memory, single step instructions to
see their effects, etc. Triggers can then also be placed in the code that will stop
execution and start up the debugger when particular conditions arise. These are very
powerful tools when developing application software. 'Lock-off' refers to the fact
that when one phase of the project is complete, development can move onto the
next. In a hardware development you cannot iterate as easily as in software, as each
iteration requires expensive or time consuming fabrication.

• Because it is likely that low-level modules or hardware acceleration 'controllers' will
have to be developed for the stack being produced, developers will have to become
familiar with the assembly language of the target processor, and will become
dependent upon the development tools provided for that processor.

• Lack of modularity, coupled [...] code is not reused,
means that much the same work will have to be redone for the next digital broadcast
stack to be produced.



Coupled with these difficulties are an associated set of 'strategic' problems that arise from
this type of approach to stack development, in which stacks are inevitably strongly
attached to a particular hardware environment, namely:

o From the stack producer's point of view, there is an uncomfortably close relationship
with the chosen DSP hardware platform. Not only must this be selected carefully,
since mistakes will require a costly (and time-consuming) port, but the development
tools, low-level assembly language, test 'rack', hardware development and final
platform ASIC production will all be architecture-specific. If an opportunity to use
the stack on another hardware platform comes up, it will first have to be ported,
which will take quite a long time and introduce multiple codebases (and thereby the
strong risk of platform-specific bugs). The code base is the source code that
underpins a project. Ideally when developing software you would have a one to one
mapping between source code and functionality, so if a number of projects require a
particular function they would all share the same implementation. Thus, if that
implementation is improved all projects will benefit. What tends to happen, however,
is that separate projects have separate copies of the code and over time the
implementations diverge (rather like genes in the natural world). When projects use
different hardware, under the conventional development paradigm, it is sometimes
impossible to use the same code. And even if the same hardware platform becomes
available with an upgraded specification, the code will still have to undergo a
'mini-port' to be able to use those additional features (more on-board memory, for
example, or a second MAC (Multiply Accumulate) unit).

o From the hardware producer's point of view, there is an equally uncomfortably close
relationship with the software stacks. Hardware producers do not want (on the
whole) to become experts in the business of stack production, and yet without such
stacks (to turn their devices into useful products) they find themselves unable to shift
units. For the marketplace, the available 'software base' can obscure the other
features upon which the hardware producer's products ought more properly to
compete (such as available MIPs, power consumption, available hardware IP, etc.).

o Operating system providers (such as Symbian Limited) find it essential to interface
their OS with baseband communications stacks; in practice this can be very difficult
to achieve because of the monolithic, power hungry and real-time requirements of
conventional stacks.

Reference may be made to eXpressDSP Real-Time Software Technology from Texas
Instruments Incorporated. This suite of products enables the reduction of development
and integration time for DSP software. But it exemplifies many of the disadvantages of
conventional design approaches since it is not a virtual machine layer.

Key concepts in the CVM 

The CVM is software for designing, modelling or performing digital signal processing,
which comprises a virtual machine layer optimised for communications DSP.

A 'virtual machine' typically defines the functionality and interfaces of the ideal machine
for implementing the type of applications relevant to the present invention. It typically
presents to the using application an ideal machine, optimised for the task in hand, and
hides the irregularities and deficiencies of the actual hardware. The 'virtual machine' may
also manage and/or maintain one or more state machines modelling or representing
communications processes. The 'virtual machine layer' is then software that makes a real
machine look like this ideal one. This layer will typically be different for every real
machine type. A 'virtual machine layer' typically refers to a layer of software which
provides a set of one or more APIs (Application Program Interfaces) to perform some
task or set of tasks (e.g. digital signal processing) and which also owns the critical
resources that must be allocated and shared between using programs (e.g. resources such
as memory and CPU).

The virtual machine layer is preferably optimised to allocate, share and switch resources
in such a way as is best for digital signal processing; a typical operating system, in
contrast, will be optimised for general user-interface programs, such as word processors.
Thus, for example, the resource switching algorithms in this case will typically operate on
much smaller time increments than that of an end-user operating system and may control
parallel processes.

The virtual machine layer, optimised for a communications DSP, insulates software
baseband stacks from the hardware upon which they must execute. Hence, baseband
stacks can be made very portable since they can be isolated by the virtual machine layer
from changes in the underlying hardware. The virtual machine layer may also manage
flow control between different connected modules (each performing different functions);
this may be done on a concurrent basis. It may also define common data structures for
signal processing, as will be described in more detail subsequently.
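The insulation that a virtual machine layer provides can be sketched as stack code written against an abstract interface, with a different backend per platform. This is not the CVM API, which is not reproduced here; the names `SignalBackend`, `ReferenceBackend`, `CountingBackend` and `smooth`, and the choice of an FIR filter as the exposed operation, are assumptions for illustration.

```python
from abc import ABC, abstractmethod


class SignalBackend(ABC):
    """What each real platform must supply to the virtual machine layer."""

    @abstractmethod
    def fir(self, samples, taps):
        """Return the full convolution of samples with taps."""


class ReferenceBackend(SignalBackend):
    """Plain-software backend, standing in for a DSP implementation."""

    def fir(self, samples, taps):
        out = [0.0] * (len(samples) + len(taps) - 1)
        for i, s in enumerate(samples):
            for j, t in enumerate(taps):
                out[i + j] += s * t
        return out


class CountingBackend(ReferenceBackend):
    """Second backend, standing in for an accelerator; counts offloaded calls."""

    def __init__(self):
        self.calls = 0

    def fir(self, samples, taps):
        self.calls += 1
        return super().fir(samples, taps)


def smooth(backend, samples):
    """'Stack' code: written once against the interface, not the platform."""
    return backend.fir(samples, [0.5, 0.5])
```

The `smooth` function runs unchanged on either backend, which is the portability property the text attributes to stacks written for the virtual machine rather than for a specific device.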

The CVM may be used in a development environment to enable a communications
device (e.g. a baseband stack, or indeed an entire SoC including several baseband stacks
from different vendors, or an end product such as a mobile telephone) to be modelled
and developed or to actually perform baseband processing.



The potency of applying the 'virtual machine layer' concept to the domain of
communications DSPs can best be understood through an example from a non-analogous
field. In the field of PC software, Microsoft's Windows™ operating system (sitting on
top of the system BIOS) insulates software developers from the actual machine in use,
and from the specifics of the devices connected to it. It provides, in other words, a
'virtual machine layer', upon which code can operate. Because of this virtual machine
layer, it is not necessary for someone writing a word processor, for example, to know
whether it is a Dell or a Compaq machine that will execute their code, or what sort of
printer the user has connected (if any). Furthermore, the operating system provides a set
of common components, functions and services (such as file dialog panels, memory
allocation mechanisms, and thread management APIs). Because it is only written once, the
rigour, extent and reliability of such 'common code' is greatly increased over what would
be the case if each application had to re-implement it, over and over again. Further, the
manufacturers of PC hardware are protected from the complexities of software
development, having only to provide a BIOS and drivers for the appropriate Windows
APIs in order to take advantage of the vast array of existing software for that platform.




This situation can be contrasted with the pre-Windows situation in which each
application would frequently contain its own custom GUI code and drivers.

A key enabler for the PC Windows 'virtual machine layer' approach is that a large
number of applications require largely the same underlying 'virtual machine' functionality.
If only one application ever needed to use a printer, or only one needed multithreading,
then it would not be effective for these services to be part of the Windows 'virtual
machine layer'. But this is not the case, as there are a large number of applications with
similar I/O requirements (windows, icons, mice, pointers, printers, disk store, etc.) and
similar 'common code' requirements, making the PC 'virtual machine layer' a compelling
proposition.

However, prior to the CVM, no-one had considered applying the 'virtual machine'
concept to the field of communications DSPs or basestations; by doing so, the CVM
enables software to be written for the virtual machine rather than a specific DSP,
de-coupling engineers from the architecture constraints of DSPs from any one source of
manufacture. This form of DSP independence is as potentially useful as the hardware
independence in the PC world delivered by the Microsoft Windows operating system.

There are therefore several key advantages to various implementations of the present
invention:

• Porting baseband stacks across DSP architectures, and to different media access
  hardware (such as, for example, porting a stack for a GSM phone operating at 900
  MHz to one operating at 1800 MHz) will be much faster since the CVM enables
  stacks to be designed which are not architecture or spectrum specific: a critical
  advantage as time to market becomes ever more important. Hence, a stack will work
  on any DSP architecture to which the virtual machine layer has been ported.
  Likewise, a DSP to which the virtual machine layer has been ported will run all the
  stacks written for the virtual machine layer.

• Much of the high MIPS, complex code (e.g. a Viterbi decoder) will be written once
  only for the virtual machine layer, as opposed to many different times for each DSP
  architecture. Hence, quality and reliability of this complex code can be economically
  improved. That in turn means that the baseband stacks will themselves need less
  code and what stack code there is need be less complex, thus increasing its reliability.

• The virtual machine layer provides the ability to prototype either entirely in software
  or with a mixture of software and proven DSP components, allowing the
  identification of algorithmic deficiencies and resource requirements earlier in the
  development cycle.






The virtual machine layer is programmed with or enables access to various core
processes and/or core structures and/or core functions and/or flow control and/or state
management. The core processes with which the virtual machine layer is programmed
(or enables access to) include one or more 'common engines'. These 'common engines'
perform one or more of the baseband stack functions, namely: source coding, channel
coding, modulation and their inverses (source decoding, channel decoding and
demodulation). The 'common engines' include the fast Fourier transform (FFT), Viterbi
decoder (with various constraint lengths, Galois polynomials and puncturing vectors),
Reed-Solomon engines, discrete cosine transform (DCT) for the MPEG decoders, time
and frequency bitwise re-ordering for error decoherence, complex vector multiplication
and Euler synthesis. A more extensive list is contained at Appendix 2. One or more of
these parameterised transforms are commonly required by communications baseband
stacks. This subsidiary feature is predicated on the inventive insight that a set of
common processes is found within almost all of the key digital broadcast systems; an
example is the similarity of GSM to DAB: both, for example, use interleaving and Viterbi
decoding. Commonality is hence predicated on a common mathematical foundation.
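
The notion of a parameterised 'common engine' shared between stacks can be illustrated
with a minimal Python sketch. It assumes a simple row/column block interleaver as a
stand-in for the 'time and frequency bitwise re-ordering' engine; the function names, the
stack names and the 4x6 and 3x8 geometries are hypothetical and are not taken from
GSM, DAB or the CVM itself.

```python
def block_interleave(bits, rows, cols):
    """Write row by row, read column by column (hypothetical engine)."""
    if len(bits) != rows * cols:
        raise ValueError("block length must equal rows * cols")
    return [bits[r * cols + c] for c in range(cols) for r in range(rows)]


def block_deinterleave(bits, rows, cols):
    """Inverse of block_interleave for the same rows/cols parameters."""
    if len(bits) != rows * cols:
        raise ValueError("block length must equal rows * cols")
    out = [0] * (rows * cols)
    i = 0
    for c in range(cols):
        for r in range(rows):
            out[r * cols + c] = bits[i]
            i += 1
    return out


# Two hypothetical stacks reuse the same engine with different parameters.
STACK_PARAMS = {"stack_a": (4, 6), "stack_b": (3, 8)}

if __name__ == "__main__":
    data = list(range(24))
    for name, (rows, cols) in STACK_PARAMS.items():
        mixed = block_interleave(data, rows, cols)
        assert block_deinterleave(mixed, rows, cols) == data
        print(name, mixed[:6])
```

Only the parameters differ between the two callers; the engine body is written once, which
is the economy the passage above describes.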




In addition, a 'core structure' may also be present in each case. The 'core structure'
involves splitting the decoding chain up into a symbol processing section (concerned
with processing full symbols, regardless of whether all the information held within that
symbol is to be used) and data directed processing, in which only those bits which hold
relevant information are processed. In each case, it is highly desirable that the processing
modules are able to allocate, share and dispose of intermediate, aligned memory buffers,
pass events between themselves, and exist within a framework that enables modular
development.

The core function may relate to resource allocation and scheduling, and may include one
or more of the following: memory allocation, real time resource allocation and concurrency
management.

The software can [illegible] superior in performance and capability than DSP design
tools. It may be subject to conformance scripting, as will be defined subsequently. In
addition, it may operate with a component in which only that information [illegible] to
operate with and/or otherwise model the performance of the component as supplied by the
owner of the intellectual property in the component. This enables [illegible] can be
valuable trade secret information (such as internal details, design and operation) to
hide that information [illegible] the functions supported, the parameters required, the
APIs, timing and resource interactions, and the expected performance [illegible].




Summary of the CVM implementation

The CVM is both a platform for developing digital signal processing products and also a
runtime for actually running those products. The CVM in essence brings the complexity
management techniques associated with a virtual machine layer to real-time digital signal
processing by (i) placing high MIPS digital signal processing computations (which may be
implemented in an architecture specific manner) into 'engines' on one side of the virtual
machine layer and (ii) placing architecture neutral, low MIPS code (e.g. the Layer 1 code
defining various low MIPS processes) on the other side. More specifically, the CVM
separates all high complexity, but low-MIPs control plane and data 'operations and
parameters' flow functionality from the high-MIPs 'engines' performing
resource-intensive operations (e.g., Viterbi decoding, FFT, correlations, etc.). This
separation enables complex communications baseband stacks to be built in an 'architecture
neutral', highly portable manner since baseband stacks can be designed to run on the CVM,
rather than the underlying hardware. The CVM presents a uniform set of APIs to the high
complexity, low MIPS control codes of these stacks, allowing high MIPS engines to be
re-used for many different kinds of stacks (e.g. a Viterbi decoding engine can be used for
both a GSM and a UMTS stack).

During the development stage of a digital signal processing product, the MIPS
requirements of various designs of the digital signal processing product can be simulated
or modelled by the CVM in order to identify the arrangement which gives the optimal
access cost (e.g. will perform with the minimum number of processors); a resource
allocation process is used which uses at least one stochastic, statistical distribution
function, as opposed to a deterministic function. Simulations of various DSP chip and
FPGA implementations are possible; placing high MIPS operations into FPGAs is highly
desirable because of their speed and parallel processing capabilities.



During actual operation, a scheduler in the CVM can intelligently allocate tasks in
real-time to computational resources in order to maintain optimal operation. This approach
is referred to as '2 Phase Scheduling' in this specification. Because the resource
requirements of different engines can be (i) explicitly modelled at design time and (ii)
intelligently utilised during runtime, it is possible to mix engines from several different
vendors in a single product. As noted above, these engines connect up to the Layer 1
control codes not directly, but instead through the intermediary of the CVM virtual
machine layer. Further, efficient migration from the PC non-real time prototype to a run
time using a DSP and FPGA combination and then onto a custom ASIC is possible
using the CVM.

The CVM is implemented with three key features:

• Dynamic, multi-memory-space, multiprocessor distributed scheduler with support
  for co-scheduling.

• APIs to commonly used, high-MIPs operations for digital broadcast and
  communications, with architecture-native implementations.

• Resource management and normalisation layer (provided over the native RTOS).

The CVM can exist in several 'pipeline' forms. A 'pipeline' is a structure or set of
interoperating hardware or software devices and routines which pass information from
one device or process to another. In the DSP environment, such pieces of information
are often referred to as 'symbols'. Pipelines can also be implemented as data flow
architectures as well as conventional procedural code, and all such variants are within the
scope of the present invention. The CVM can also be conceptualised and implemented
as a state machine or as procedural code, and again all such variants are within the scope
of the present invention.

One instance of the CVM contains an Interpreted Pipeline Manager, which incorporates
run-time versions of the CVM core. By 'interpreted' we mean that its specification has
not been translated into the underlying machine code, but is repeatedly re-translated as
the program runs, in exactly the same way as an interpreted language, such as BASIC.




Another instance is an Instrumented Interpreted Pipeline Manager which incorporates
run-time versions of the CVM core. This operates in the same way as an Interpreted
Pipeline Manager, but also produces metrics and measurements helpful to the developer.
An interpreted non-instrumented version is also useful for development and debugging,
as is a compiled and instrumented version. The latter may be the optimal tool for
developing and debugging.
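
The difference between the interpreted and the instrumented interpreted forms can be
sketched as follows. This is a minimal Python illustration under stated assumptions: the
function `run_pipeline`, the stage names and the metrics layout are all hypothetical, and
the stages are trivial stand-ins for real engines.

```python
import time


def run_pipeline(stages, symbol, metrics=None):
    """Pass one symbol through each (name, function) stage in order.

    The stage list is walked afresh for every symbol, which is the
    'interpreted' behaviour; if a metrics dict is supplied the run is
    also 'instrumented' with call counts and elapsed time per stage.
    """
    for name, stage in stages:
        start = time.perf_counter()
        symbol = stage(symbol)
        if metrics is not None:
            entry = metrics.setdefault(name, {"calls": 0, "seconds": 0.0})
            entry["calls"] += 1
            entry["seconds"] += time.perf_counter() - start
    return symbol


# Hypothetical stages standing in for real engines.
STAGES = [
    ("scale", lambda xs: [2 * x for x in xs]),
    ("reverse", lambda xs: xs[::-1]),
    ("sum", sum),
]

if __name__ == "__main__":
    stats = {}
    print(run_pipeline(STAGES, [1, 2, 3], stats))
    print(sorted(stats))
```

Passing `metrics=None` gives the non-instrumented behaviour with the same stage list, so
one pipeline description serves both purposes.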



Another version of the CVM is a Pipeline Builder. Instead of running, it outputs 
computer source code, such as C, which can be compiled to produce a Pipeline 
implementation. For this reason it must have available to it CVM libraries. It can be 
thought of as the compiled and non-instrumented variant. 
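
A Pipeline Builder of this kind can be pictured with a short Python sketch that emits C
source rather than executing the stages. The generator name, the stage names and the
assumed stage signature `void stage(float *buf, int n)` are hypothetical; the real CVM
library interfaces are not given in this section.

```python
def build_pipeline_source(stage_names, func_name="run_pipeline"):
    """Return C source that calls each named stage in order.

    The stage functions (for example, from CVM libraries) are assumed to
    share the hypothetical signature: void stage(float *buf, int n).
    """
    for name in list(stage_names) + [func_name]:
        if not name.isidentifier():
            raise ValueError(f"not a valid C identifier: {name!r}")
    lines = ["/* generated file: do not edit */"]
    for name in stage_names:
        lines.append(f"void {name}(float *buf, int n);")
    lines.append("")
    lines.append(f"void {func_name}(float *buf, int n)")
    lines.append("{")
    for name in stage_names:
        lines.append(f"    {name}(buf, n);")
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_pipeline_source(["fft", "deinterleave", "viterbi"]))
```

The emitted file still has to be compiled and linked against libraries that define the
stage functions, which is why the builder needs those libraries available to it.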

The CVM apparatus may include or relate to a standardised description of the
characteristics (including non-interface behaviour) of communications components to
enable a simulator to accurately estimate the resource requirements of a system using
those components. Time and concurrency constraints may be modelled in the CVM
apparatus, enabling mapping onto a real time OS, with the possibility of parallel
processing.



CVM DETAILED DESCRIPTION 



CVM Overview 

The CVM is both a platform for developing digital signal processing products and also a
runtime for actually running those products. The CVM in essence brings the complexity
management techniques associated with a virtual machine layer to real-time digital signal
processing by (i) placing high MIPS digital signal processing computations (which may be
implemented in an architecture specific manner) into 'engines' on one side of the virtual
machine layer and (ii) placing architecture neutral, low MIPS code (e.g. the Layer 1 code
defining various low MIPS processes) on the other side. More specifically, the CVM
separates all high complexity, but low-MIPs control plane and data 'operations and
parameters' flow functionality from the high-MIPs 'engines' performing
resource-intensive operations (e.g., Viterbi decoding, FFT, correlations, etc.). This
separation enables complex communications baseband stacks to be built in an 'architecture
neutral', highly portable manner since baseband stacks can be designed to run on the CVM,
rather than the underlying hardware. The CVM presents a uniform set of APIs to the high
complexity, low MIPS control codes of these stacks, allowing high MIPS engines to be
re-used for many different kinds of stacks (e.g. a Viterbi decoding engine can be used for
both a GSM and a UMTS stack).

The virtual machine layer supports underlying high MIPs algorithms common to a
number of different baseband processing algorithms, and makes these accessible to high
level, architecture neutral, potentially high complexity but low-MIPs control flows
through a scheduler interface. This interface allows the control flow to specify the
algorithm to be executed, together with a set of resource constraint envelopes, relating to
one or more of: time of execution, memory and interconnect bandwidth, inside of which the
caller desires the execution to take place.
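
A request through such a scheduler interface might be sketched as below. This is a
minimal Python illustration, not the CVM API: the `Envelope` and `EngineProfile` types,
their fields and units, the example profile figures, and the rule of preferring the fastest
engine that fits are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Caller-specified limits; all fields and units are hypothetical."""
    max_time_us: float
    max_memory_bytes: int
    max_bandwidth_mbps: float


@dataclass(frozen=True)
class EngineProfile:
    """Modelled cost of running one algorithm on one engine instance."""
    name: str
    algorithm: str
    time_us: float
    memory_bytes: int
    bandwidth_mbps: float

    def fits(self, env):
        return (self.time_us <= env.max_time_us
                and self.memory_bytes <= env.max_memory_bytes
                and self.bandwidth_mbps <= env.max_bandwidth_mbps)


def select_engine(profiles, algorithm, env):
    """Return the fastest engine for `algorithm` inside the envelope, or None."""
    candidates = [p for p in profiles
                  if p.algorithm == algorithm and p.fits(env)]
    return min(candidates, key=lambda p: p.time_us, default=None)


# Illustrative figures only; they are not measurements of any real device.
PROFILES = [
    EngineProfile("dsp_software", "viterbi", 400.0, 8192, 10.0),
    EngineProfile("fpga_block", "viterbi", 50.0, 2048, 200.0),
    EngineProfile("dsp_fft", "fft", 120.0, 4096, 20.0),
]

if __name__ == "__main__":
    tight_bus = Envelope(max_time_us=500, max_memory_bytes=16384,
                         max_bandwidth_mbps=50)
    print(select_engine(PROFILES, "viterbi", tight_bus))
```

The calling code names only the algorithm and the envelope; which engine instance serves
the request is decided behind the interface.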

During the development stage of a digital signal processing product, the MIPS
requirements of various designs of the digital signal processing product can be simulated
or modelled by the CVM in order to identify the arrangement which gives the optimal
access cost (e.g. will perform with the minimum number of processors); a resource
allocation process is used for modelling which uses at least one stochastic, statistical
distribution function as opposed to a deterministic function. Simulations of various DSP
chip and FPGA implementations are possible; placing high MIPS operations into FPGAs is
highly desirable because of their speed and parallel processing capabilities.

During actual operation, a scheduler in the CVM can intelligently allocate tasks in
real-time to computational resources in order to maintain optimal operation. This approach
is referred to as '2 Phase Scheduling' in this specification. Because the resource
requirements of different engines can be (i) explicitly modelled at design time and (ii)
intelligently utilised during runtime, it is possible to mix engines from several different
vendors in a single product. As noted above, these engines connect up to the Layer 1
control codes not directly, but instead through the intermediary of the CVM virtual
machine layer. Further, efficient migration from the PC non-real time prototype to a
run time using a DSP and FPGA combination and then onto a custom ASIC is possible.

The CVM is implemented with three key features: 

• Dynamic, multi-memory-space, multiprocessor distributed scheduler with support
  for co-scheduling.

• APIs to commonly used, high-MIPs operations for digital broadcast and
  communications, with architecture-native implementations.

• Resource management and normalisation layer (provided over the native RTOS).



The CVM is a design flow solution as well as a runtime 

The CVM provides a complete design flow to complement the runtime. This provides
the engineer with fully integrated mathematical models, statistical simulation tools
(essential for operation with bursty data) and a priori partitioning simulation tools (to
determine e.g. whether a datapath should go into hardware or run in software on a DSP
core). Through the use of custom libraries for mathematical modelling tools (e.g. Matlab
/ Simulink), the CVM is able to model in detail and with bit-exact accuracy the
high-MIPs engine operations, allowing engineers to determine up front how many bits wide
the various datapaths must be, etc. However, the system is also able to accept XML
commands from a statistically simulated control plane, allowing birth/death events and
burstiness to be handled within the context of the model. Furthermore, since even the
simulation engines are accessed through the scheduler's indirection interface, it is
possible to plug in calls to e.g. real hardware implementations to speed simulation
execution.

It is also, importantly, possible to perform simulation of resource loading under various
system partitioning decisions. How many instances of a particular algorithmic 'engine'
(e.g., a Viterbi decoder, a RAKE receiver element, a block FFT operation, etc.) are
required to provide sufficient cover under various statistical loadings? What happens if a
datapath is moved across a latent and/or contended resource such as a bus? What if the
datapath is implemented in hardware rather than software? All of these decisions are
critical but existing toolsets have not addressed them, and this is doubly true when the
partitioning decisions are being made with respect to multiple, third-party IP engines or
engines (see below). The CVM design flow explicitly enables these sorts of design
decisions to be answered. Furthermore, initial partitioning information is then 'fed
forward' from the design toolset into the runtime scheduler, enabling it to vector
requests off to the appropriate engine instances for implementation when the system is
under actual dynamic load.
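
The question of how many engine instances give sufficient cover under a statistical load
can be explored with a small stochastic simulation. The Python sketch below is only an
illustration under assumptions that are not in the text: requests arrive with exponentially
distributed gaps, every request occupies an engine for a fixed service time, and a request
that finds all instances busy is dropped rather than queued.

```python
import random


def poisson_arrivals(rate, count, seed=0):
    """Return `count` arrival times with exponential gaps (bursty load)."""
    rng = random.Random(seed)
    t, times = 0.0, []
    for _ in range(count):
        t += rng.expovariate(rate)
        times.append(t)
    return times


def blocked_fraction(arrivals, service_time, n_engines):
    """Fraction of requests that find every engine instance busy.

    A blocked request is simply dropped (a loss model, chosen for brevity).
    """
    free_at = [0.0] * n_engines  # time at which each instance becomes idle
    blocked = 0
    for t in arrivals:
        idle = [i for i, f in enumerate(free_at) if f <= t]
        if idle:
            free_at[idle[0]] = t + service_time
        else:
            blocked += 1
    return blocked / len(arrivals) if arrivals else 0.0


if __name__ == "__main__":
    load = poisson_arrivals(rate=3.0, count=5000, seed=1)
    for n in range(1, 7):
        print(n, round(blocked_fraction(load, 1.0, n), 3))
```

Running the loop for increasing instance counts gives the designer a table from which the
smallest count meeting a target blocking level can be read off and then fed forward to the
runtime scheduler.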

Working from the 'bottom up', treating the software largely as an afterthought, is no
longer a viable route to market; this path simply takes too long, yields a result that is too
architecture-specific, and [illegible] the parallel, state-machine nature of the
underlying domain. Working from the 'top down', the paradigm utilised by the CVM,
provides a much more powerful and extensible solution.




A final point about the CVM is that by separating out the control flow code from the
underlying engines, it becomes possible to perform a lot of development work on
conventional platforms (e.g., PCs) without having to work with the actual embedded
target. This allows for much faster turnaround of designs than is generally possible when
using a particular vendor's end target development platform.

Example: The CVM is a design solution for hard real time, multi-vendor,
multi-protocol environments such as SoC for 3G systems

One of the core elements of the CVM is its ability to deal with (potentially conflicting)
resource requirements of third party software/hardware in a hard real time, multi-vendor,
multi-protocol environment. This ability is a key benefit of the CVM and is of particular
importance when designing a system on chip (SoC). To understand this, consider the
problems faced by a would-be provider of a baseband chip for the 3G cellular phone
market. First, because of the complexity of the layer 1 processing required, simply writing
code for an off-the-shelf DSP is not an option; an ASIC will be required to handle the
complexities of despreading, turbo decoding, etc. Secondly, since UMTS will only be
rolled out in a small number of metro locations initially, the chip will also need to be able
to support GSM. It is unlikely that the company producing the baseband chip will have
extensive skills in both these areas, therefore IP will need to be licensed in. This point
becomes particularly relevant in light of the ever increasing time-to-market pressures for
technology companies. But licensing in part-hardware, part-software IP engines from
multiple vendors for layer 1 provides a real problem. First, there is no current common
simple standard for 'mix and match' IP in this manner. What is needed, and what the
CVM design flow provides, is a way to characterise both the static and dynamic resource
requirements of a 3rd party IP block, so that it may be co-scheduled in real time with other
IP engines, potentially from an entirely different supplier, and then connected
transparently through to the higher level layer 1 control code. Furthermore, the nature of
the CVM is that these high-level overall call structures and control planes can be
produced in an architecture-neutral language (e.g., SDL compiled to ANSI C), with only
the low-level, high-MIPs parts being implemented directly in an architecture-specific
form.



As noted above, the high MIPs functionality contained within the engines represents
complete operational routines. These engines may be implemented in hardware or
software or some combination of the two, but this is unimportant from the point of view
of the high level 'calling' code, which is entirely abstracted from the engines. The
high-level IP communicates with the underlying engines via CVM scheduler calls, which
allow the hard real-time dynamic resource constraints to be specified. The scheduler then
dispatches the request to the appropriate datapath for execution, which may involve
calling a function on a DSP, or passing data to an FPGA or ASIC. Importantly, the
scheduler can deal with multiple hard datapaths that may have different access and
execution profiles (for example, an on-bus Viterbi decoder, an on-chip software based
decoder, and an off-chip dedicated ASIC accessed via external DMA) and pass
particular requests off to the appropriate unit, which is completely independent from the
calling high-level code.

This also means that, where two different communications stacks require some common
high-MIPs engines, a vendor of an appropriate (platform-specific) engine implementation
(whether designed in hardware, software, or some combination of both) can sell into
both markets, and, if the two standards are implemented on a single SoC, both stacks can
potentially share the same accelerator. In addition, the CVM specifies a set of over 100
core operations which, taken together, provide around 80% of the high-MIPs
functionality found in the vast majority of digital broadcast and communications
protocols. The CVM [illegible] the underlying RTOS, presenting the high-level code with
a normalised interface for resource management (including threads [illegible]).

Using the CVM, it is possible to construct an integrated development platform for
communications SoC products, in which a number of third party vendors are able to
publish their IP, as either high-level architecture neutral SDL or C++ components, or
architecture specific, resource profiled engines (which can be hardware, software, or a
combination of both). An integrated design flow would enable the SoC designer to
produce an overall system that contains the appropriate engines (chosen from particular
vendors); add her own IP on both or either side of the CVM; and then generate both the
deployable hardware specification (as a number of VHDL-defined cores, together with
accelerators) and software components. It is possible to construct a toolset which would
provide a complete flow through mathematical modelling, statistical a priori stochastic
simulation for partitioning, protocol verification and final system generation and provide
appropriate mechanisms to characterise, publish, enumerate and use libraries of
'packaged' IP within designs.



This system would have the potential to become the main workbench for SoC designers, 
who would only have to go into VHDL tools to develop the high-MIPs engines, not any 
of the layer 1 control fabric. 



The CVM allows SDL to be used in designing Layer 1



As noted above, the CVM allows the low-MIPs code to be written in an architecture
neutral manner, using either ANSI C++ or, preferably, SDL which may then be
compiled to ANSI C. SDL is a language widely used within the telecommunication
industry for the representation of layer 2 and layer 3 stacks, and is particularly well suited
to systems that are most economically expressed in a state machine format. SDL
traditionally would not be appropriate for use below layer 2 (the end of the 'soft real
time' domain). The SDL code is entirely portable between various architectures, and may
be tested in the normal manner using tools such as TTCN. System constraints (such as
dynamic resource ceilings) can be attached to various portions of the code and substrate
interconnects in development and then simulated with realistic loading models to allow
up-front partitioning of the datapaths into hardware and software. Importantly, the CVM
scheduler is cognisant of the datapath partitioning decisions taken during the design time
portion of the development process. The toolflow is fully integrated with Matlab and
Simulink, allowing bit-accurate testing of high-MIPs functionality. The use of SDL as the
preferred language for the high-level logic flows within layer 1 is not accidental: SDL
has been widely used within layers 2 and 3 of telecommunications stacks such as GSM,
but has not crossed the chasm into the hard real time domain. With the CVM, by
contrast, it becomes possible to invoke parallel, hard real time execution from SDL
control flows, thereby allowing the extremely powerful and natural state machine
expressiveness of SDL to be used to author the high level layer 1 algorithms.
Increasingly, although low MIPs, these algorithms are themselves extremely complex, as
they must deal with issues such as bursty rate-matching, user transport channel birth /
death events, handovers between multiple standards, and QoS-bound graceful
degradation under load, to name but a few. Other languages not designed for real-time
operations (e.g. C++ and Java) can also be used in designing Layer 1, as alternatives to
SDL.
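
The state-machine style of Layer 1 control flow described above can be sketched in a few
lines. The example is written in Python rather than SDL purely for brevity; the states, the
event names, the operation name `channel_decode` and the `request_engine` callback
(standing in for a call through the CVM scheduler) are all hypothetical.

```python
class ChannelController:
    """Toy Layer 1 control flow for one transport channel.

    States, events and the `request_engine` callback are hypothetical;
    the callback stands in for a call through the CVM scheduler.
    """

    TRANSITIONS = {
        ("idle", "birth"): "active",
        ("active", "data"): "active",
        ("active", "death"): "idle",
    }

    def __init__(self, request_engine):
        self.state = "idle"
        self.request_engine = request_engine

    def handle(self, event, payload=None):
        key = (self.state, event)
        if key not in self.TRANSITIONS:
            raise ValueError(f"event {event!r} not valid in state {self.state!r}")
        if event == "data":
            # Low-MIPs control code delegates the heavy work to an engine.
            self.request_engine("channel_decode", payload)
        self.state = self.TRANSITIONS[key]
        return self.state


if __name__ == "__main__":
    log = []
    ctrl = ChannelController(lambda op, data: log.append((op, data)))
    for event, payload in [("birth", None), ("data", [1, 0, 1]), ("death", None)]:
        print(event, "->", ctrl.handle(event, payload))
    print(log)
```

The controller itself does no signal processing; it only decides when an engine should be
invoked, which is the division of labour between control code and engines set out above.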



Theoretical background to the CVM 

Current digital communications systems are built around a largely common consensus,
which has emerged in the last 15 years or so, about the best way to reliably transmit
information wirelessly in the face of quite severe channel effects. Two-way systems have
somewhat different channel and modulation requirements from broadcast-oriented
systems (for example, using CDMA to provide graceful degradation in the face of a
congested spectral band, and having some 'hard' real time requirements), but overall
much commonality exists.



For example, in the specific case of broadcast (one-way) systems, decoders and encoders
may be seen as simply [illegible] systems. [illegible] start with source coding [illegible]
(to reduce bitrate), followed by channel coding, which adds structured redundancy to
improve the ability of the receiver to extract information despite signal [illegible],
and then modulation (at which point a number of subcarriers are modified in some
combination of angle (frequency or phase) or amplitude to hold the information). The
reverse process is then carried out in the receiver, yielding (on one level) the diagram of
Figure 5. Hence, a set of common processing engines are found within almost all of the
key digital broadcast systems, and a common processing structure may also be applied in
each case.

The CVM embodiment exploits this as follows: the common engines (or functions or
libraries) include algorithms to perform one or more of the following: source coding,
channel coding, modulation, or their inverses, namely source decoding, channel decoding
and demodulation. They include, for example, the fast Fourier transform (FFT), Viterbi
decoder (with various constraint lengths, Galois polynomials and puncturing vectors),
Reed-Solomon engines, discrete cosine transform (DCT) for the MPEG decoders, time
and frequency bitwise re-ordering for error decoherence, complex vector multiplication
and Euler synthesis, etc. A more extensive list is at Appendix 2. These are high MIPS
routines and therefore ideally implemented in a CVM in an architecture specific manner
(either through assembly code or hardware accelerators). They can, regardless of this, be
accessed in the CVM through common, high level APIs. Each of these parameterised
transforms has a parallel mathematical modelling block provided for it.



The common structure involves splitting the decoding chain up into a symbol processing
section (concerned with processing full symbols, regardless of whether all the
information held within that symbol is to be used) and data directed processing, in which
only those bits which hold relevant information are processed. In each case, it is critical
that the processing modules are able to allocate, share and dispose of intermediate,
aligned memory buffers, pass events between themselves, and exist within a framework
that enables modular development. The common structure is paralleled where
appropriate in a mathematical modelling environment and described via graph
description language (GDL). Figure 6 schematically depicts this common block and
structure approach used in the CVM.
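
The split between symbol processing and data directed processing can be illustrated with
a minimal Python sketch. Hard-decision slicing is used here only as a placeholder for
whatever full-symbol processing a real stack performs, and the function names and the
`wanted` index list are hypothetical.

```python
def symbol_processing(symbols, threshold=0.0):
    """Process every received symbol: hard-decide each soft value to a bit."""
    return [[1 if v > threshold else 0 for v in symbol] for symbol in symbols]


def data_directed_processing(symbol_bits, wanted):
    """Keep only the bits at the (symbol index, bit index) pairs in `wanted`."""
    return [symbol_bits[s][b] for s, b in wanted]


if __name__ == "__main__":
    received = [[0.9, -0.2, 0.4], [-1.0, 0.3, -0.7]]
    bits = symbol_processing(received)           # every symbol, every bit
    wanted = [(0, 2), (1, 1)]                    # only the relevant bits
    print(data_directed_processing(bits, wanted))
```

The first stage always does the same amount of work per symbol, whereas the cost of the
second stage depends on how much of each symbol the application actually needs.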



A similar analysis may be provided for 2-way systems, except that there is an additional
CCS (calculus of concurrent systems) requirement and resource allocation issue, and the
required 'critical mass' of processing engines is slightly different.

It is interesting that current generation [illegible] tools and hardware deployment
platforms (DSPs and DSP cores) do not reflect the structural realities discussed above,
and do not (on the whole) [illegible] tailored towards communications baseband
applications nor the 2-phase scheduling approach (see below). Nor do current embedded
operating systems support these operations in any systematic or coherent manner.

However, the number of digital communications systems is increasing rapidly, creating a
demand for rapid time-to-market deployment of baseband stacks. As explained above, a
core innovative approach of the present invention is to exploit the underlying
commonality and requirements of such systems by providing a software-hosted common
'virtual machine layer' (exemplified by the CVM embodiment) reifying these capabilities
and software structure. One key commercial application is as a design solution for hard
real time, multi-vendor, multiprotocol environments such as SoC (as noted above).

CVM Development Methodologies

The development methodology used in the CVM builds upon (and departs from) a
methodology using layered development and layered deployment. These concepts will be
discussed initially. Layered development refers to a process of progressing from
mathematical models, through C++ or SDL code to a target assembler implementation
(if necessary). Throughout this process, each of the modules in question is maintained at
each of the necessary levels (for example, a convolutional decoder would exist as a
parallel mathematical model, C++ implementation, SIMD model and assembler
implementations in various target languages).

Layered deployment refers to the use of libraries to isolate the code as far as possible from 
the underlying hardware and [...] actually implemented. Hence, as much as possible of 
the code (high complexity but low MIPs requirement) is kept as generic SDL or 
ANSI-compliant C++ which is then simply recompiled for the target platform. For 
example, a library is used to provide platform-dependent functions such as simple I/O, 
allocation of memory buffers etc. Another library is used to provide high-cycle routines 
(such as the FFT, Viterbi decoder, etc.) in an architecture specific manner, which may 
involve the use of highly crafted assembler routines or even call-throughs to specialised 
hardware acceleration engines. 

These two libraries, no matter what the underlying hardware and operating system 
substrate, are manifest as a common API to the 'core' code, which therefore does not 
have to be modified during a port. The only code which does get modified, namely the 
contents of the library implementations, benefits from significant encapsulation and a 
wide variety of test vectors generated from the mathematical models. It is because the 
points of articulation in the architecture are appropriately positioned that porting of 
stacks can be rapidly achieved using this approach. 
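
The two-library arrangement described above can be illustrated with a minimal C++ sketch. It is an assumption-laden illustration, not the libraries of the specification: the names HighCycleLib, PortableLib, TargetLib and signalEnergy are hypothetical, and a dot product stands in for a high-cycle routine such as an FFT or Viterbi decoder. The point shown is only that the 'core' function is written once against a common API while the implementation behind that API is swapped per platform.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical common API for the high-cycle library (names are illustrative).
struct HighCycleLib {
    virtual ~HighCycleLib() = default;
    virtual double dotProduct(const std::vector<double>& a,
                              const std::vector<double>& b) const = 0;
};

// Portable reference implementation, e.g. for a PC development host.
struct PortableLib : HighCycleLib {
    double dotProduct(const std::vector<double>& a,
                      const std::vector<double>& b) const override {
        assert(a.size() == b.size());
        double acc = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i) acc += a[i] * b[i];
        return acc;
    }
};

// Stand-in for a target-specific implementation. A real port might call
// crafted assembler or a hardware accelerator; here the loop is merely
// unrolled by two to show that the internals may differ freely.
struct TargetLib : HighCycleLib {
    double dotProduct(const std::vector<double>& a,
                      const std::vector<double>& b) const override {
        assert(a.size() == b.size());
        double acc0 = 0.0, acc1 = 0.0;
        std::size_t i = 0;
        for (; i + 1 < a.size(); i += 2) {
            acc0 += a[i] * b[i];
            acc1 += a[i + 1] * b[i + 1];
        }
        if (i < a.size()) acc0 += a[i] * b[i];  // odd-length tail
        return acc0 + acc1;
    }
};

// 'Core' code: written once against the common API, unchanged during a port.
double signalEnergy(const HighCycleLib& lib, const std::vector<double>& x) {
    return lib.dotProduct(x, x);
}
```

Calling signalEnergy with either library object exercises the same core code; test vectors generated from a mathematical model could be run against both implementations in the same way.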



Furthermore, as a development platform, this approach has the great advantage that one 
can develop on one architecture (e.g. the Intel platform) running not a mathematical 
model but rather a full, real-time transceiver, and then simply swap the libraries and 
recompile on the target architecture. This is very useful when trying to, e.g., tune an 
equaliser module. 

The CVM approach builds on this way of working. However, in addition, as much as 
possible of the common functionality is abstracted into the 'virtual machine' hardware 
abstraction layer, together with key services and functions that are useful for all digital 
communications baseband processing work. 

Figure 7 below shows how this would work at an architectural level. Instead of the given 
stack being shipped with different library implementations for platform A and platform 
B, in the CVM there is a common 'baseband operating system' layer for each of platform 
A and platform B, providing a common API on top of which (apart from a recompile) 
the higher level code can run unchanged. 

Furthermore, we can incorporate into this layer much of the functionality that otherwise 
would lie within the C++ core, such as the symbol subscriber architecture for 
symbol-directed processing, and the pipeline architecture for data directed processing. 


Specific CVM Development Methodologies: Two Phase Scheduling 

Phase I 

An important aspect when building a baseband communications system is quantifying 
the requirements of the hardware and software platform the application will run on. A 
baseline calculation of the number of MIPs (millions of instructions per second) an 
application will require is relatively straightforward: simply calculate the requirements of 
each component to perform one operation, multiply by the number of operations and 
add them all together. This, however, does not take into account aspects like parallelism. 
Although, theoretically, 2 x 500 MIPs processors will deliver 1000 MIPs of processing 
power, the algorithms may not be able to take advantage of this if they are waiting for 
operations on another chip to complete. There are also the extra processing requirements 
of the scheduler and the data transfer to consider. The data transfer penalty is 
probably small if both processors are on the same board but more significant if they are 
on separate boards plugged into an external bus. Bus contention (two or more processors 
wanting to transfer data at the same time) can also reduce overall efficiency. 

The CVM provides a number of methods to facilitate implementing systems in this sort 
of distributed environment. 

Initially we can quantify the requirements of the individual computing components such 
as the signal processing functions described in Appendix 2 and the more application 
specific engines built upon them. In environments like 3G mobile communications the 
amount of data passing through a block will vary over time so it is not sufficient just to 
calculate the requirements of a block at one data rate. Instead a profile will be built up 
over the range of potential input vector sizes. 

The CVM allows a system to be defined as a collection of data flows (pipelines) where 
data is injected at one end, and consumed at the other. The engines on these pipelines are 
characterised in terms of how much processing they require as a function of input vector 
size. The first pass at calculating the MIPs usage is to simulate passing engines of varying 
size along this pipeline and calculating the total usage as a function of input block size. 
This calculates the total MIPs requirements of the engines assuming they are run 
sequentially to completion on a single processor. 
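
The first-pass calculation just described can be sketched in C++ as follows. This is a minimal illustration under stated assumptions: the Engine type, its costMi member (cost in millions of instructions for one input vector of n elements) and the function names are hypothetical, and the sketch ignores any change of vector size between stages.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical engine characterisation: cost, in millions of instructions,
// of processing one input vector of n elements.
struct Engine {
    std::function<double(std::size_t)> costMi;
};

// Total cost of running every engine on the pipeline sequentially to
// completion on a single processor, for one input vector size.
double pipelineCost(const std::vector<Engine>& pipeline, std::size_t n) {
    double total = 0.0;
    for (const Engine& e : pipeline) {
        assert(e.costMi != nullptr);  // every engine must be characterised
        total += e.costMi(n);
    }
    return total;
}

// Profile of total cost over a range of potential input vector sizes.
std::vector<double> costProfile(const std::vector<Engine>& pipeline,
                                const std::vector<std::size_t>& sizes) {
    std::vector<double> profile;
    profile.reserve(sizes.size());
    for (std::size_t n : sizes) profile.push_back(pipelineCost(pipeline, n));
    return profile;
}
```

Each entry of the returned profile corresponds to one simulated input size; multiplying by the number of vectors per second would give a MIPs figure for that operating point.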



A more sophisticated model then assigns engines to separate processors and allows true 
pipelining. A solution based on this architecture will require more MIPs than the single 
threaded solution but has the potential, once the pipeline is loaded, to process data 
engines in shorter elapsed time. If N is the number of processors, E(N) the efficiency of 
processor utilisation (1 = 100%, 0 = idle), Mp the MIPs rating of a single processor and 
M the total MIPs requirement of the problem then the time T to process 1 second's worth 
of data will be: 

T = M / (E(N) x N x Mp) 

The objective is to find the smallest value of N where T is less than 1 by a comfortable 
margin. E(N) will be close to 1 for a single board and will drop as the number of boards 
is increased (because of the overheads introduced by scheduling and data transfer). E(N) 
will also vary depending on how the processing engines are distributed between the 
boards (because of the varying data transfer requirements and the possibility of uneven 
load balancing leaving a processor idle some of the time). 
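
As a worked illustration of the formula T = M / (E(N) x N x Mp), the following C++ sketch searches for the smallest N that meets a margin. The function names, the margin parameter and the efficiency model passed in by the caller are assumptions made for the example; the specification leaves E(N) to be determined by simulation.

```cpp
#include <cassert>
#include <functional>

// Time to process one second's worth of data: T = M / (E x N x Mp).
double processingTime(double M, int N, double Mp, double E) {
    assert(N > 0 && Mp > 0.0 && E > 0.0);
    return M / (E * N * Mp);
}

// Smallest N in [1, maxN] with T <= 1 - margin, or 0 if none qualifies.
// 'efficiency' is a caller-supplied (hypothetical) model of E(N).
int smallestProcessorCount(double M, double Mp,
                           const std::function<double(int)>& efficiency,
                           double margin, int maxN) {
    for (int N = 1; N <= maxN; ++N) {
        const double T = processingTime(M, N, Mp, efficiency(N));
        if (T <= 1.0 - margin) return N;
    }
    return 0;
}
```

For example, with M = 1000 MIPs, Mp = 500 MIPs, an assumed efficiency of 1.0 for one processor and 0.8 otherwise, and a 10% margin, two processors give T = 1.25 and three give T of roughly 0.83, so the search returns 3 rather than the 2 suggested by the naive 2 x 500 calculation.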

A CVM simulator that has knowledge of the scheduling process, the characteristics of the 
bus and the characteristics of the engines will be able to calculate E(N) and hence T for 
different numbers of boards and engine arrangements. It will also be possible to 
investigate the effects of "doubling up" some of the engines; that is, having the same 
functionality on more than one board. 

Once we know the sequence of engines that are required for a task we can set the CVM 
to search through arrangements of engines and boards looking for the optimal solution. 
It will also be possible to have individual Mp values for the boards (replace N x Mp by 
the sum of the individual Mps) and to tie specific engines to specific boards; for instance 
a Viterbi decoder will always run on an FPGA, which will have a higher MIPs rating than 
a DSP. For large numbers of engines exhaustive searches will become impractical and 
some assistance from an engineer will be required. 
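
A brute-force version of the search over arrangements might look like the C++ sketch below. It is a simplified illustration, not the CVM's search: the types EngineSpec and Assignment are hypothetical, data transfer and scheduling overheads are ignored, and the figure of merit is assumed to be the busiest board's load divided by its own Mp. The number of arrangements grows as boards to the power of engines, which is why the text notes that exhaustive search becomes impractical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical engine description for the search.
struct EngineSpec {
    double mips;      // MIPs the engine requires
    int pinnedBoard;  // board index the engine is tied to, or -1 if free
};

struct Assignment {
    std::vector<int> boardOf;  // boardOf[e] = board chosen for engine e
    double worstTime;          // max over boards of (load / board Mp)
};

// Enumerates every engine-to-board arrangement (boards^engines of them)
// and keeps the one whose busiest board finishes soonest.
Assignment exhaustiveSearch(const std::vector<EngineSpec>& engines,
                            const std::vector<double>& boardMp) {
    assert(!boardMp.empty());
    const std::size_t numEngines = engines.size();
    const int numBoards = static_cast<int>(boardMp.size());
    std::vector<int> current(numEngines, 0);
    Assignment best{{}, std::numeric_limits<double>::infinity()};
    for (;;) {
        bool valid = true;
        std::vector<double> load(boardMp.size(), 0.0);
        for (std::size_t e = 0; e < numEngines && valid; ++e) {
            const int pin = engines[e].pinnedBoard;
            if (pin >= 0 && pin != current[e]) valid = false;
            else load[static_cast<std::size_t>(current[e])] += engines[e].mips;
        }
        if (valid) {
            double worst = 0.0;
            for (std::size_t b = 0; b < load.size(); ++b)
                worst = std::max(worst, load[b] / boardMp[b]);
            if (worst < best.worstTime) best = Assignment{current, worst};
        }
        // Advance 'current' like an odometer in base numBoards.
        std::size_t pos = 0;
        while (pos < numEngines && ++current[pos] == numBoards) {
            current[pos] = 0;
            ++pos;
        }
        if (pos == numEngines) break;  // wrapped around: all tried
    }
    return best;
}
```

Pinning an engine (for instance a Viterbi decoder tied to the board that models an FPGA) simply invalidates every arrangement that places it elsewhere.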



Phase II 

Once we have an acceptable arrangement of engines and boards we can move onto 
phase two of the scheduling process, "doing it for real". Phase I will have generated a 
system configuration which can now be used to load the engines onto the correct boards. 
This information will also be made available to the scheduler on the main board. Once 
the system is running, data engines will flow from the scheduler to the engines that will 
operate on them. Most of the time this scheduler will simply send data onward in the 
order they need to be processed but there will be occasions when more intelligence can 
be applied. When there are multiple engines of equivalent priority the scheduler will look 
to try and balance the queue sizes on all the boards by scheduling work to the least 
loaded. When the same functionality exists on more than one board the scheduler will 
again look for the most appropriate board to schedule. All the boards will have a local 
scheduler to obviate the need to involve the main scheduler in routing engines between 
two engines on the same board. When there is a choice of board to send work to, 
schedulers will always choose their own board when possible. The scheduler will also 
have to monitor the absolute urgency of the most urgent engines, looking for potential 
lulls in the processing when it can schedule less urgent activities, such as routing log 
messages and monitoring information back to a monitoring console. 
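
The board-selection rules in this paragraph (prefer the scheduler's own board, otherwise the least loaded board that hosts the required functionality) can be sketched in C++ as follows. The function name chooseBoard and its parameters are hypothetical, and priorities and urgency monitoring are deliberately left out of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Picks a board for the next piece of work.
//   queueLen[b] : current queue size on board b
//   canRun[b]   : whether board b hosts the required engine
//   localBoard  : the scheduler's own board, or -1 for the main scheduler
// Returns the chosen board index, or -1 if no board can run the work.
int chooseBoard(const std::vector<std::size_t>& queueLen,
                const std::vector<bool>& canRun,
                int localBoard) {
    assert(queueLen.size() == canRun.size());
    // A local scheduler keeps work on its own board whenever possible.
    if (localBoard >= 0 &&
        static_cast<std::size_t>(localBoard) < canRun.size() &&
        canRun[static_cast<std::size_t>(localBoard)]) {
        return localBoard;
    }
    // Otherwise balance queues by choosing the least loaded capable board.
    int best = -1;
    for (std::size_t b = 0; b < queueLen.size(); ++b) {
        if (!canRun[b]) continue;
        if (best < 0 || queueLen[b] < queueLen[static_cast<std::size_t>(best)])
            best = static_cast<int>(b);
    }
    return best;
}
```

Ties are broken in favour of the lowest board index here; that choice is arbitrary and not taken from the specification.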



More CVM Development Methodologies: the MIPS Counter as used in a UMTS 
implementation 

As noted above, the CVM consists of a number of distributed engines that are connected 
and controlled by the CVM Scheduler. These engines may sit on the same hardware, but 
could sit on different hardware (CPU, DSP or FPGA). For a UMTS implementation of 
the CVM, a system to identify bottlenecks and aid in serialising the engines/blocks has 
been developed. We first assume that the processing route for a block of data is given; 
for instance the UMTS standards 25.212 and 25.222 suggest how the block is muxed in 
the TrCH stage. Some of the processing may then be switched between routes depending 
on some objective criteria such as BER. However, the required engines are known. Then, 
the order of the engine must be determined in terms of the data size and number of 
users. For example, if a vector is of length n, and if the engine consists of 

    for (int i = 0; i < n; i++) 
    { 
        for (int j = 0; j < n; j++) 
        { 
            // Do something... 
        } 
    } 

then we can say that the process is of order n^2, or O(n^2). Next we can count the 
number of operations ('*', '+', etc.) in '//Do something'. FFTs are, for example, n log(n) 
processes. We can then multiply this by the device's instructions per operation and then 
divide this by the number of MIPS to get the time that the device will take to perform a 
task. Alternatively we can simply set a relative time. 

The same process can be repeated for the number of users (K): for example MU can go 
as 2^K. Finally, each block may or may not change the bit rate: Turbo Encoding 
increases it multiplicatively by a factor of 3; CRC adds 12 bits. (Note that bus latency, 
the scheduler, parallelisation/serialisation can all be considered to be engines.) 

The point is that we know the data rate. The question answered by this process is how 
we can distribute the engines (e.g. their MIPS budget) to accommodate this. 
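
The counting procedure of this section can be sketched in C++ as below. All names are hypothetical, the logarithm base for the n log(n) case is assumed to be 2 (the text does not state a base), and the two bit-rate helpers encode only the figures quoted in the text (a factor of 3 for Turbo Encoding, 12 bits for CRC).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Complexity order of an engine in terms of its input vector length n.
enum class Order { Linear, NLogN, Quadratic };

// Number of loop iterations implied by the order (log base 2 is assumed).
double iterationCount(Order order, double n) {
    assert(n >= 1.0);
    switch (order) {
        case Order::Linear:    return n;
        case Order::NLogN:     return n * std::log2(n);
        case Order::Quadratic: return n * n;
    }
    return 0.0;
}

// Time in seconds: iterations x operations per iteration x the device's
// instructions per operation, divided by the device's MIPS rating.
double engineTimeSeconds(Order order, double n, double opsPerIteration,
                         double instructionsPerOp, double deviceMips) {
    assert(deviceMips > 0.0);
    const double instructions =
        iterationCount(order, n) * opsPerIteration * instructionsPerOp;
    return instructions / (deviceMips * 1.0e6);
}

// Bit-count bookkeeping for the two examples given in the text.
std::size_t afterCrc(std::size_t bits)   { return bits + 12; }
std::size_t afterTurbo(std::size_t bits) { return bits * 3; }
```

For instance, an O(n^2) engine with n = 1000, two operations per iteration and one instruction per operation needs 2 million instructions, which takes 0.02 s on a 100 MIPS device.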

Top Down Design 

Traversing the processing chain is quite complex when state and data control are needed. 
This procedure is used to tie in RS C++ blocks through a standard adaptor to integrate 
with Simulink. Fundamentally, the intention is to move through hierarchies. As you 
move up layers, so the abstraction becomes higher and higher. The intention is to round 
trip data: a 'user' creates 3 services, and the UE transmits (Tx) this to the BS through a 
physical channel with certain properties. The BS receives and decodes the data. In this 
case the BS has a trivial backhaul, and retransmits the data back to the UE, through a 
physical channel, whereupon the data is compared to the input data. This system allows 
us to interchange engines to improve performance in terms of BER and time in a variety 
of channels. 

CVM Features 

The CVM can be thought of as a minimal OS to provide the sorts of functionality 
required by baseband processing stacks (and, as mentioned, these can be two-way stacks 
also, such as GSM or Bluetooth). It is therefore complementary to a full-blown 
embedded operating system like Microsoft Windows CE or Symbian's EPOC. 

The CVM provides (inter alia) the following functionality: 

• Extensive set of vector-processing primitives (more completely listed at Appendix 
2), covering operations such as FFTs, FIR and IIR and wave digital filters, 
decimation, correlation, complex multiplication, etc. These should use hardware 
acceleration where this is available on the underlying hardware, and would be 
accessed via a set of library calls paralleling an extended version of a library. In a 
sense, this aspect of the CVM represents a software or API abstraction of an 
idealised digital signal processing engine for digital communications. 

• Support for allocation of aligned buffers and memory 'handshaking' (ping-pong 
buffers). 

• Advanced scheduling management, with the option for pre-emptive multithreading 
of a simple kind. Hard real-time performance (i.e., the ability to guarantee that a piece 
of code will execute at a particular point in time) will be supported as a key 
component of the architecture. [...] shared memory) and thread synchronisation 
facilities will be provided. A key feature is a stochastic parallel scheduler cognisant 
of design time partitioning decisions for CVM engines across a heterogeneous 
computational substrate. 

• Explicit support for the notion of symbol and data directed processing. This will 
directly support the ability to add symbol subscribers and pipeline stages into the 
structure to allow modular development. 

• Support for key I/O peripherals, including serial ports, parallel ports and display 
controllers. 

• Extensibility to enable the scope of the O/S to be increased, particularly for modular 
I/O support. 

• Characterisation libraries for a particular implementation, allowing mathematical 
models and real-time prototypes to mimic the performance of the target substrate 
and interconnects to a high degree of accuracy. 

• PC versions to enable the production of real-time prototypes. 

• Support for communication with a host (application) OS. This will be bi-directional 
to enable callbacks and so on. A component intercommunication technology (e.g. 
COM) may be used to provide the 'binary glue'. A suitable application OS might be, 
for example, EPOC32 or Windows CE, as these are OSs designed to perform the 
more usual user-level I/O and structured storage management. 

• Ability to 'pare down' the ROM image of the CVM at build time to ensure that the 
minimum ROM (hence, ultimately, chip area) is used. This uses a minimal 
implementation of the CVM. 

• State machine functionality management (including potential integration with SDL). 

• Support for data structures. 

• Transforms between different representations (such as fixed and floating point). 

The goal of the CVM is to enable the rapid deployment of particular applications onto 
particular targets, with the multiplicity of applications coming at the development stage. 
Conventional OSs are designed [...] of apps that are essentially unknown when the OS is 
loaded, but this is typically not the case with the CVM. Moreover the CVM does not 
need to handle interaction with a user, except by supporting presentation streams 
through portals provided by the 'host' OS. 

The CVM incorporates a number of the features that are currently in the high-level C++ 
code of a DAB stack into the infrastructure level (such as the appropriate modular 
structure for the development of symbol-directed and data-directed processing), and is 
not simply a 'library wrapper'. 

The CVM concept rests upon the idea (critically dependent upon domain knowledge that 
can only be achieved through review of the various standards and the process of actually 
building the stacks) that abstracting the common functions and (importantly) processing 
structures required by modern digital broadcast and communications standards is 
possible and can be achieved elegantly through an appropriate software abstraction layer 
coupled with a systematic layered development environment. 

CVM Advantages 

With the CVM, stack developers are isolated from the particular hardware in use. The 
CVM provides support for the structures (e.g., symbol and data-directed pipelines, and 
state machines), functions (e.g., memory allocation, and real time resource and 
concurrency management) and libraries (e.g., for FFT, Viterbi, convolution, etc.) required 
by digital communication baseband stacks, to enable code to be written once, in a 
high-level language (SDL, ANSI C/C++ or Java) and merely recompiled (if necessary; with 
Java it would not be, and COM or some other form of component intercommunication 
technology can provide the 'binary level' glue to link the modules together) to run on a 
particular platform, making calls through to the hardware abstraction layer provided by 
the CVM layer. 

Prototyping using the CVM will be very rapid, with each of the DSP modules paralleled 
by a mathematical model. Memory allocation and partitioning will be supported by an 
automated toolset (parameterised by the desired target hardware) rather than relying on 
guesswork. Once the processing chain is established on the model (which will optionally 
be performed by graphical arrangement and parameterisation rather than coding) and is 
working successfully, it will be possible to run a real-time PC-based version (using the 
Intel MMX/SIMD version of the CVM, together with RadioScape's generic baseband 
processor module). Any changes to the standard code (e.g. a custom equaliser) may then 
be integrated in a modular, incremental fashion and the code-test-edit cycle (being PC 
based) could use all the latest PC development tools, and be very rapid. Use of hardware 
acceleration on the target platform will be covered by the CVM (since all of the required 
cycle-intensive features for digital communications baseband processing will be provided 
as library calls at the CVM API). Clearly, the use of an appropriately adapted underlying 
hardware unit would provide targeted acceleration for most of the desired functions. For 
many applications, the support of lightweight pre-emptive multithreading and other 
low-level functions on the CVM itself will obviate the need to use any other RTOS, but 
interaction with a user-OS (such as Windows CE or Symbian's EPOC) will be supported 
and straightforward through the APIs discussed above. 



With this approach, a CVM-compatible stack, once written, would be portable instantly 
to any of the hardware platforms onto which the CVM itself had been ported (always 
providing, of course, that there were sufficient resources (MIPs, memory, bandwidth) on 
the target machine to execute the desired stack in real time) without involving extra work. 
This would represent a substantial market opportunity (assuming reasonable 
cross-platform penetration of the CVM) for stack vendors, as it will essentially insulate their 
developments from hardware specificity. There is also a particularly significant 
commercial opportunity for designing multi-vendor SoC products (see above). 

From the hardware vendor's point of view, the advantage of the CVM is that once it is 
ported for a given processor, that processor would automatically support (resources 
permitting) all stacks that had been written to the CVM API. This, of course, obviates 
the need for hardware providers to [...]; they need only port the CVM. It also means 
that the need to produce and support a full-specification development environment and 
toolset is reduced, since stack vendors (for the digital communications market at least) 
would then be able to develop code purely in ANSI C/C++ or Java. It should be noted 
that the CVM concept does not apply to all digital signal processing tasks, for example, 
making a PID controller for use in a car braking system. The reason that the CVM 
concept works for digital communication baseband processing is that, as explained 
above, there is a large pool of commonality in such systems that can be exploited; 
however, the CVM does not necessarily provide all the tools, structures or functions 
that would be required for other digital signal processing tasks. Of course, it would 
potentially be possible to identify other such 'islands' of common function and extend 
the CVM idiom to cover their needs, but we are focussed here on the baseband aspects 
because they are highly in demand, and strongly exhibit the necessary commonality. The 
CVM approach leaves the hardware vendor free to compete not on the existing 
application set, but rather on the virtues of their hardware (e.g., MIPs, targeted 
acceleration, memory, power consumption). 

The CVM Development Cycle 

The process of actually using the CVM to develop a baseband stack will now be 
described. For the purposes of this specification, a device is the target being developed, 
such as a digital radio. A component is an identifiable specific part of it: either software, 
hardware, or both. 'Interpreted' means code (possibly compiled) which reads in 
configurations at run time. 

The CVM Development Cycle begins with the 'Component Definition Language'. This 
language enables the full externally visible attributes of a component to be specified, as 
well as its behaviour. The intention is that this can be written by a manufacturer or (as 
will be seen later) could be generated by test runs of an instrumented CVM. 

Via a set of plug-ins the Component Definition Language can be read in to a 
mathematical modelling tool, such as the industry popular MatLab or Mathematica. 
Using the modelling tool, the theoretical behaviour of all components to be used in the 
device would be explored and understood. 



The results of this investigation would then be either inscribed, or output via another 
plug-in to be developed, into 'Device Definition Language': just as Component 
Definition Language defines a component, this defines the target device being built, and 
will contain such elements as which components are used. 

In effect, the Device Definition Language defines the communications 'Pipeline' that is 
being developed. The Pipeline concept is important since most communications devices 
can be thought of as the process of moving information through a pipeline, performing 
transforms on the way. It is in effect an electronic assembly line, but rather than operate 
on parts of a car, it operates on items of data commonly called 'symbols'. Thus a radio 
signal would eventually be transformed to an audio signal. Of course, 'real' devices are 
often more complicated than a simple pipeline, and may have more than one pipeline, 
branches, or loops. The CVM development process allows a pipeline design to be tested 
before a full hardware version is ever built. This leads to shorter development times. 
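
The Pipeline concept described above can be illustrated with a small C++ sketch. It assumes the simplest case of a single linear pipeline (no branches or loops), models a symbol as an int purely for brevity, and uses hypothetical names (Symbols, Stage, Pipeline); it is not the Pipeline Manager of the specification.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// For illustration only: a block of symbols is modelled as a vector of ints.
using Symbols = std::vector<int>;

// A stage transforms one block of symbols into another.
using Stage = std::function<Symbols(const Symbols&)>;

// A linear pipeline: data is injected at one end and consumed at the other,
// with each stage performing a transform on the way.
class Pipeline {
public:
    void addStage(Stage s) {
        assert(s != nullptr);  // a stage must be callable
        stages_.push_back(std::move(s));
    }

    Symbols run(Symbols data) const {
        for (const Stage& s : stages_) data = s(data);
        return data;
    }

private:
    std::vector<Stage> stages_;
};
```

Because stages are added independently, a software stage could later be replaced by a wrapper around a hardware block without changing the stages on either side of it.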

To fully define a target device, or pipeline, more information is needed. We also need a 
description of the resources (such as CPU rate) available on our target, and this is defined 
in a 'Conformance Scripting Language' and interconnects. We also need to know how 
each component is used (both physical and software APIs); this is achieved using 
'Component API Specifications'. 

These three resources: the Device Definition Language, the Conformance Scripting 
Language, and the Component API Specifications, are now used within one of several 
possible CVMs. The first is the 'Instrumented Interpreted' (or, preferably, Instrumented 
and Compiled, which will perform more rapidly than the interpreted version) Pipeline 
Manager. This has some similarity to a software ICE. It reads the three resources and 
then emulates the pipeline (emulation may be in real time): so if the target is a radio it 
then runs as a radio. Because of the Conformance Scripting Language it is able to 
simulate any bottlenecks or resource limitations that would exist on the target device and 
is useful for development and debugging. In addition to running, the Instrumented 
Interpreted or Instrumented Compiled Pipeline Manager also outputs diagnostic 
information for each device in Component Definition Language. This is important, since 
it can now be fed back into the development cycle and merged with the original 
Component Definition Language descriptions to refine that description. Hence, 
information on actual performance is made available to the designer before any hardware 
is constructed, and this is where the (substantial) development savings are made. This 
closes the inner loop of the development cycle. The Instrumented Interpreted or 
Instrumented Compiled Pipeline Manager incorporates run-time versions of the CVM 
core. It is possible for software elements of the Instrumented Interpreted or 
Instrumented Compiled Pipeline Manager to be replaced by hardware versions (ideally 
one at a time, so that bugs can be detected as they are introduced). This is another 
development process enhancement. This corresponds to the 2 Phase Scheduling process 
(see above) involving the design time partitioning of engines across the computational 
substrate. 

The second CVM is an 'Interpreted Pipeline Manager'. It is not instrumented, but in 
other regards is identical; it may be used in development and debugging and by a 
manufacturer to produce a complete product. This is the third benefit: much of the 
work in writing a communications device is already done. It also incorporates run-time 
versions of the CVM core. 

The third CVM is a 'Pipeline Builder'. It can be thought of as a Compiled 
Non-Instrumented variant. Like the other two it reads the three resources, but instead of 
running it outputs computer source code, such as C, which can be compiled to produce a 
Pipeline implementation. For this reason it must have available to it CVM libraries. 
Testing this closes the outer loop of the development cycle. The overall approach of the 
CVM development cycle is shown schematically at Figures 8 and 9. 

Appendix 2 

Examples of Core Processes 

Signal Transforms and Frequency Domain Analysis 
• Signal Flow Graphs (SFG) 
• Discrete Frequency DFT 
• Windowing (Hamming, Hanning etc.) 
• Digital FIR Filters 
• Impulse Response 
• Frequency Response 
• FIR Low Pass Digital Filter 
• Infinite Impulse Response Digital Filters 

Adaptive Signal Processing 
• Components for Adaptive Signal Processing including Adaptive Digital Filters 
• Channel Identification 
• Echo Cancellation 
• Acoustic Echo Cancellation 
• Background Noise Suppression 
• Channel Equalisation 
• Adaptive Line Enhancement (ALE) 
• Adaptive Algorithms, including: 
• Minimising the Mean Squared Error 
• Adaptive Algorithm for FIR Filter 
• Mean Squared Error 
• Minimum Mean Squared Error Solution 
• Wiener-Hopf Solution 
• Gradient Techniques 1 
• Gradient Techniques 2 
• The LMS Algorithm 
• Recursive Least Squares 
• Adaptive IIR Filtering 
• Gradient IIR Filtering Techniques 
• Feintuch's IIR LMS 
• Equation Error LMS Algorithm 
• Directed Mode (DDM) 
• Subband Adaptive Filter (SAF) Structure 



Multirate Signal Processing 
• Upsampling & Downsampling 
• Interpolating Low Pass Filter 
• Oversampling and Reconstruction 
• Sigma-Delta Processing Architecture 
• Subband Processing 
• M-Channel Filter Banks by Iteration 
• Modulated Filter Banks 
• Polyphase Filter Banks 
• QMF Filter Banks 

Audio Signal Source Coding 
• Lossless Huffman Coding/Decoding 
• Linear PCM 
• Companding 
• Adaptive Quantization Tools 
• Linear Predictive Coding 
• Long-Term Prediction 
• Delta Modulation (DM) 
• Differential PCM (DPCM) 
• Adaptive DPCM (ADPCM) 
• LPC Vocoder 
• Code-Excited Linear Prediction (CELP) 
• Algebraic CELP (ACELP) 
• Subband Coding 
• Tools for Psychoacoustics 
• Spectral Masking 
• Temporal Masking 
• Precision Adaptive Subband Coding and bit allocation and bit stream formatting 
tools 






Digital Modulation 
• XOR long and short code spreading/despreading 
• Amplitude Modulation 
• Quadrature Amplitude Modulation (QAM) 
• Quadrature Demodulation 
• Complex Quadrature Modulation 
• Complex Quadrature Demodulation 
• QPSK 
• n-PSK 
• M-ary Amplitude Shift Keying 
• n/n QPSK 
• Unipolar RZ and NRZ Signalling 
• Polar and Bipolar RZ and NRZ Signalling 
• Bandpass Shift Keying, including 
• Amplitude (On-Off) Shift Keying 
• Binary Phase Shift Keying (BPSK) 
• Frequency Shift Keying including 
• Bandpass Filtering for BPSK 
• Pulse Shaping including 
• Nyquist (Sinc) Pulse Shaping 
• Raised Cosine Pulse Shaping 
• Root Raised Cosine Pulse Shaping 

Spread Spectrum Tools 
• Pseudo Random Code Generation 
• Gold Sequences 
• Kasami Sequences 
• Orthogonal Spreading Codes 






WO 01/54300 



PCT/GB01/00280 






• Variable Length OC Generation 
• Orthogonal Walsh codes 
• Code Detection 
• Rake Receiver implementing 
• NBI Rejection Techniques including 
• Prediction filters 
• NBI rejection in Transform Domain 
• Decision feedback NBI rejection 



Tools for Management of Multiple Access & Detection 
• TDMA including 
• TDMA Frames 
• TDMA combined with FDMA 
• CDMA including 
• Direct Sequence (DS) CDMA 
• Power Control 
• Beamforming Tools 
• Frequency Hopping CDMA 
• Multiuser Detection (MUD) 
• Multiple Access Interference Suppression 
• Decorrelator 
• Interference canceller 
• Adaptive MMSE 
• MMSE receiver training 
• Adaptive MMSE receiver DDM 



Mobile Channels 

• Rayleigh Fading Suppression mechanisms (Gaussian, Riceian) 

• Modelling and suppression tools, including: 

• Time spreading 

• Time spreading: coherence bandwidth 

• Time spreading: flat fading 

• Time spreading: Freq selective fading 

• Time variant behaviour of the channel 

• Doppler effect 






Channel Coding

• Cyclic Coder
• Reed Solomon Encoder
• Convolutional Encoder
• CE Puncturing
• Interleaving
• Convolutional Decoder
• Viterbi Decoder (Hard and soft decision)
• Turbo Codes
• Turbo Encoding
• Turbo Decoding



Equalisation

• Adaptive Channel Equalisation
• FIR Equaliser
• Decision Feedback Equaliser
• Direct conversion toolkit
• QAM Analog RF/IF Architecture
• QAM IF Downconversion support
• Bandpass Sigma Delta support
• Bandpass Sigma Delta to Baseband support
• Bandpass and fs/4 Systems



Signal Processing Library Functions 

This section describes some of the signal processing functions available with the CVM.



Vector Manipulation Functions 



AutoCorrelate         Estimates a normal, biased or unbiased auto-correlation of an
                      input vector and stores the result in a second vector.

Conjugate (vector)    Computes the complex conjugate of a vector. The result can be
                      returned in place or in a second vector.

Conjugate (value)     Returns the conjugate of a complex value.

ExtendedConjugate     Computes the conjugate-symmetric extension of a vector in-place
                      or in a new vector.

Exp                   Computes a vector where each element is e to the power of the
                      corresponding element in the input vector. The result can be
                      returned in place or in a second vector.

InverseThreshold      Computes the inverse of the elements of a vector, with a
                      threshold value. The result can be returned in place or in a
                      second vector.

Threshold             Performs the threshold operation on a vector. The result can be
                      returned in place or in a second vector.

CrossCorrelate        Estimates the cross-correlation of two vectors and stores the
                      result in a third vector.

DotProduct            Computes a dot product of two vectors after applying the
                      ExtendedConjugate operation to them.

ExtendedDotProd       Computes a dot product of two conjugate-symmetric extended
                      vectors.

DownSample            Down-samples a signal, conceptually decreasing its sampling rate
                      by an integer factor. Returns the result in a second vector.

Max                   Returns the maximum value in a vector.

Mean                  Computes the mean (average) of the elements in a vector.

Min                   Returns the minimum value in a vector.

UpSample              Up-samples a signal, conceptually increasing its sampling rate by
                      an integer factor. Returns the result in a second vector.

PowerSpectrum (1)     Returns the power spectrum of a complex vector in a second
                      vector.

PowerSpectrum (2)     Computes the power spectrum of a complex vector whose real
                      and imaginary components are two vectors. Stores the results in
                      a third vector.

Add                   Adds two vectors and stores the result in a third.

Subtract              Subtracts one vector from another and stores the result in a
                      third.

Multiply              Multiplies two vectors and stores the result in a third.

Divide                Divides one vector by another and stores the result in a third.

Complex Vector Operations

ImaginaryPart         Returns the imaginary part of a complex vector in a second
                      vector.

RealPart              Returns the real part of a complex vector in a second vector.

Magnitude (1)         Computes the magnitudes of elements of a complex vector and
                      stores the result in a second vector.

Magnitude (2)         This second version calculates the magnitudes of elements of the
                      complex vector whose real and imaginary components are
                      specified in individual real vectors and stores the result in a third
                      vector.

Phase (1)             Returns the phase angles of elements of a complex vector in a
                      second vector.

Phase (2)             Computes the phase angles of elements of the complex input
                      vector whose real and imaginary components are specified in
                      individual real vectors. The function stores the resulting phase
                      angles in a third vector.

ComplexToPolar        Converts the complex real/imaginary (Cartesian coordinate
                      X/Y) pairs of individual input vectors to polar coordinate form.
                      One version stores the magnitude (radius) component of each
                      element in one vector and the phase (angle) component of each
                      element in another vector.

ComplexToPolar        A second version returns the polar co-ordinates as (magnitude,
                      phase) pairs in a single vector.

PolarToComplex        Converts the polar form (magnitude, phase) pairs stored in a
                      vector into a complex vector. Returned in a second vector.

PolarToComplex        Converts the polar form magnitude/phase pairs stored in two
                      individual vectors into a complex vector. The function stores the
                      real component of the result in a third vector and the imaginary
                      component in a fourth vector.
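
The Cartesian-to-polar conversions listed under Complex Vector Operations can be illustrated with a minimal Python sketch. The function names `complex_to_polar` and `polar_to_complex` are hypothetical stand-ins, not the CVM library's actual signatures; the sketch covers only the variant that keeps magnitudes and phases in two separate vectors.

```python
import cmath
import math


def complex_to_polar(values):
    """Split complex samples into separate magnitude and phase vectors."""
    magnitudes = [abs(v) for v in values]
    phases = [cmath.phase(v) for v in values]  # radians, in (-pi, pi]
    return magnitudes, phases


def polar_to_complex(magnitudes, phases):
    """Rebuild complex samples from separate magnitude and phase vectors."""
    return [cmath.rect(m, p) for m, p in zip(magnitudes, phases)]


# Illustrative input only: three arbitrary complex samples.
samples = [1 + 1j, -2j, -3 + 0j]
mags, angs = complex_to_polar(samples)
restored = polar_to_complex(mags, angs)
```

Converting to polar form and back should reproduce the input to within floating-point rounding, which is a convenient self-check when porting such routines between data formats.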



Sample quantisation 

These methods convert between linear and nonlinear quantisation schemes. The number 
of bits used and the non linear parameters used can be varied. 

ALawToLinear          Converts a vector of A-law encoded samples to linear samples.
                      The result can be returned in place or in a second vector.

LinearToALaw          Encodes a vector of linear samples using the A-law format. The
                      result can be returned in place or in a second vector.

LinearToMuLaw         Encodes the linear samples in a vector using the µ-law format.
                      The result can be returned in place or in a second vector.

MuLawToLinear         Converts a vector of 8-bit µ-law encoded samples to the linear
                      format. The result can be returned in place or in a second vector.
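
As an illustration of the linear/nonlinear conversion these methods perform, the sketch below uses the continuous µ-law companding curve. It is an assumption-laden example: the function names are hypothetical, samples are taken to lie in [-1, 1], and the smooth formula is used rather than the segmented 8-bit encoding found in telephony codecs.

```python
import math

MU = 255.0  # assumed companding parameter; the text says it can be varied


def linear_to_mu_law(x, mu=MU):
    """Compress a linear sample in [-1, 1] with the continuous mu-law curve."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_to_linear(y, mu=MU):
    """Expand a companded value in [-1, 1] back to a linear sample."""
    return math.copysign(((1.0 + mu) ** abs(y) - 1.0) / mu, y)


def quantise(y, bits=8):
    """Round a companded value in [-1, 1] to a signed integer code."""
    levels = 2 ** (bits - 1) - 1
    return round(y * levels)


vector = [0.0, 0.01, -0.1, 1.0]
encoded = [linear_to_mu_law(x) for x in vector]
decoded = [mu_law_to_linear(y) for y in encoded]
```

Small inputs are expanded before quantisation (0.01 maps to roughly 0.23), which is why companding preserves low-level detail better than uniform quantisation at the same number of bits.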



Sample-Generating Functions 

RandomGaussian        Computes a vector of pseudo-random samples with a Gaussian
                      distribution.

InitialiseTone        Initialises a sinusoid generator with a given frequency, phase and
                      magnitude.

NextTone              Produces the next sample of a sinusoid of frequency, phase and
                      magnitude specified using InitialiseTone.

InitialiseTriangle    Initialises a triangle wave generator with a given frequency, phase
                      and magnitude.

NextTriangle          Produces the next sample of a triangle wave generated using the
                      parameters in InitialiseTriangle.
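
The initialise-then-step pattern of InitialiseTone and NextTone can be sketched as a small stateful generator. The class name, the explicit sample-rate argument and the choice of a sine (rather than cosine) reference are assumptions made for illustration; the source does not specify them.

```python
import math


class ToneGenerator:
    """Hypothetical stand-in for the InitialiseTone / NextTone pair."""

    def __init__(self, frequency_hz, sample_rate_hz, phase_rad=0.0, magnitude=1.0):
        # Phase advance per output sample, in radians.
        self.step = 2.0 * math.pi * frequency_hz / sample_rate_hz
        self.phase = phase_rad
        self.magnitude = magnitude

    def next_sample(self):
        """Return the current sample, then advance the stored phase."""
        sample = self.magnitude * math.sin(self.phase)
        self.phase = (self.phase + self.step) % (2.0 * math.pi)
        return sample


# A 1 kHz tone sampled at 8 kHz completes one cycle every 8 samples.
tone = ToneGenerator(frequency_hz=1000.0, sample_rate_hz=8000.0, magnitude=0.5)
first_cycle = [tone.next_sample() for _ in range(8)]
```

Keeping the phase in the generator state, rather than recomputing it from a sample index, lets successive calls continue the waveform without discontinuities.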



Windowing Functions 
BartlettWindow        Multiplies a vector by a Bartlett windowing function. The result
                      is returned in a second vector.

BlackmanWindow        Multiplies a vector by a Blackman windowing function with a
                      user-specified parameter. The result is returned in a second
                      vector.

HammingWindow         Multiplies a vector by a Hamming windowing function. The
                      result is returned in a second vector.

HannWindow            Multiplies a vector by a Hann windowing function. The result is
                      returned in a second vector.

KaiserWindow          Multiplies a vector by a Kaiser windowing function. The result is
                      returned in a second vector.
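
A minimal sketch of the multiply-by-window operation follows, using the textbook symmetric definitions of the Hann and Hamming windows. The function names are hypothetical, and the library may use a different (for example periodic) window convention.

```python
import math


def hann_window(n):
    """Symmetric Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def hamming_window(n):
    """Symmetric Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def apply_window(vector, window):
    """Multiply a vector element-wise by a window, returning a second vector."""
    if len(vector) != len(window):
        raise ValueError("vector and window must have the same length")
    return [v * w for v, w in zip(vector, window)]


signal = [1.0] * 9
shaped = apply_window(signal, hann_window(len(signal)))
```

The Hann window tapers fully to zero at both ends, whereas the Hamming window stops at 0.08, trading a slightly higher floor at the edges for lower nearest side lobes.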



Convolution Functions 

Convolve              Performs finite, linear convolution of two sequences.

Convolve2D            Performs finite, linear convolution of two two-dimensional
                      signals.

Filter2D              Filters a two-dimensional signal similar to Convolve2D, but with
                      the input and output arrays of the same size.
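
Finite, linear convolution of two sequences, as performed by Convolve, can be sketched directly from its definition. The function name and list-based interface are assumptions for illustration; an optimised library would normally use a faster method for long sequences.

```python
def convolve(a, b):
    """Full linear convolution; output length is len(a) + len(b) - 1."""
    if not a or not b:
        return []
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            # Each input pair contributes to the output at the sum of its indices.
            out[i + j] += x * y
    return out


result = convolve([1.0, 2.0, 3.0], [1.0, 1.0])
```

Here `result` is `[1.0, 3.0, 5.0, 3.0]`: the second sequence acts as a two-tap running sum over the first.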



Fourier Transform Functions 

Versions of these methods exist for a number of different data storage (fixed, floating
and integer) formats.

DiscreteFT            Computes a discrete Fourier transform in-place or in a second
                      vector.

InitialiseGoertz      Initialises the data used by Goertzel functions.

ResetGoertz           Resets the internal delay line used by the Goertzel functions.

GoertzFT (1)          Computes the DFT for a given frequency for a single signal
                      count.

GoertzFT (2)          Computes the DFT for a given frequency for a block of
                      successive signal counts.

FFT (1)               Computes a complex Fast Fourier Transform of a vector, either
                      in-place or in a new vector.

FFT (2)               Computes a forward Fast Fourier Transform of two conjugate-
                      symmetric signals, either in-place or in a new vector.

FFT (3)               Computes a forward Fast Fourier Transform of a conjugate-
                      symmetric signal, either in-place or in a new vector.

FFT (4)               Computes a Fast Fourier Transform of a complex vector and
                      returns the result in two separate (real and imaginary) vectors.

FFT (5)               Computes a Fast Fourier Transform of a complex vector
                      provided as two separate (real and imaginary) vectors and returns
                      the result in two separate (real and imaginary) vectors.

IFFT (1)              Computes an inverse Fast Fourier Transform of a vector, either
                      in-place or in a new vector.

IFFT (2)              Computes an inverse Fast Fourier Transform of two conjugate-
                      symmetric signals, either in-place or in a new vector.

IFFT (3)              Computes an inverse Fast Fourier Transform of a conjugate-
                      symmetric signal, either in-place or in a new vector.

Finite Impulse Response Filter Functions



InitialiseFIR         Initialises a low-level, single-rate finite impulse response filter
                      with a set of delay line values and taps.

FIR                   Filters a single sample through a low-level, finite impulse
                      response filter, previously configured using InitialiseFIR.

BlockFIR              Filters a block of samples through a low-level, finite impulse
                      response filter.

GetFIRDelays          Gets the delay line values for a low-level, finite impulse response
                      filter.

GetFIRTaps            Gets the tap coefficients for a low-level, finite impulse response
                      filter.

SetFIRDelays          Changes the delay line values for a low-level, finite impulse
                      response filter.

SetFIRTaps            Changes the tap coefficients for a low-level, finite impulse
                      response filter.

InitialiseMultiFIR    Initialises a low-level, multi-rate finite impulse response filter.

MultiFIR              Filters a single sample through a low-level, multi-rate finite
                      impulse response filter, previously configured using
                      InitialiseMultiFIR.

BlockMultiFIR         Filters a block of samples through a low-level, multi-rate finite
                      impulse response filter, previously configured using
                      InitialiseMultiFIR.
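
The relationship between taps, delay line, single-sample filtering and block filtering can be sketched as below. `FIRFilter` and its method names are hypothetical and cover only the single-rate case; they are not the library's own interface.

```python
class FIRFilter:
    """Single-rate FIR filter holding tap coefficients and a delay line."""

    def __init__(self, taps, delays=None):
        self.taps = list(taps)
        # The delay line holds the most recent inputs, newest first.
        self.delays = list(delays) if delays is not None else [0.0] * len(self.taps)
        if len(self.delays) != len(self.taps):
            raise ValueError("delay line and taps must have the same length")

    def filter_sample(self, x):
        """Shift one sample into the delay line and return the filter output."""
        self.delays = [x] + self.delays[:-1]
        return sum(t * d for t, d in zip(self.taps, self.delays))

    def filter_block(self, samples):
        """Filter a block by applying filter_sample to each element in turn."""
        return [self.filter_sample(x) for x in samples]


# A four-tap moving average applied to a short step input.
moving_average = FIRFilter([0.25, 0.25, 0.25, 0.25])
output = moving_average.filter_block([4.0, 4.0, 4.0, 4.0, 0.0])
```

Because the delay line persists between calls, consecutive blocks can be filtered without edge artefacts, which is presumably why the library exposes the delay values through separate get and set functions.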



Least Mean Squares Adaptation Filter Functions 

InitialiseSALF        Initialises a low-level, single-rate, adaptive FIR filter that uses the
                      least mean squares (LMS) algorithm.

InitialiseMALF        Initialises a low-level, multi-rate, adaptive FIR filter that uses the
                      least mean squares (LMS) algorithm.

InitALFDelay          Initialises a delay line for a low-level, adaptive FIR filter that uses
                      the least mean squares (LMS) algorithm.

SALF                  Filters a sample through a low-level, single-rate, adaptive FIR
                      filter that uses the least mean squares (LMS) algorithm.

MALF                  Filters a sample through a low-level, multi-rate, adaptive FIR filter
                      that uses the least mean squares (LMS) algorithm.

SLF                   Filters a sample through a low-level, single-rate, adaptive FIR
                      filter that uses the least mean squares (LMS) algorithm, but
                      without adapting the filter for a secondary signal.

MLF                   Filters a sample through a low-level, multi-rate, adaptive FIR filter
                      that uses the least mean squares (LMS) algorithm, but without
                      adapting the filter for a secondary signal.

BlockSALF             Filters a block of samples through a low-level, single-rate,
                      adaptive FIR filter that uses the least mean squares (LMS)
                      algorithm.

BlockMALF             Filters a block of samples through a low-level, multi-rate, adaptive
                      FIR filter that uses the least mean squares (LMS) algorithm.

BlockSLF              Filters a block of samples through a low-level, single-rate,
                      adaptive FIR filter that uses the least mean squares (LMS)
                      algorithm, but without adapting the filter for a secondary signal.

BlockMLF              Filters a block of samples through a low-level, multi-rate, adaptive
                      FIR filter that uses the least mean squares (LMS) algorithm, but
                      without adapting the filter for a secondary signal.

SetALFDelays          Sets the delay line values for a low-level, adaptive FIR filter that
                      uses the least mean squares (LMS) algorithm.

SetALFLeaks           Sets the leak values for a low-level, adaptive FIR filter that uses
                      the least mean squares (LMS) algorithm.

SetALFSteps           Sets the step values for a low-level, adaptive FIR filter that uses
                      the least mean squares (LMS) algorithm.

SetALFTaps            Sets the tap coefficients for a low-level, adaptive FIR filter that
                      uses the least mean squares (LMS) algorithm.

GetALFDelays          Gets the delay line values for a low-level, adaptive FIR filter that
                      uses the least mean squares (LMS) algorithm.

GetALFLeaks           Gets the leak values for a low-level, adaptive FIR filter that uses
                      the least mean squares (LMS) algorithm.

GetALFSteps           Gets the step values for a low-level, adaptive FIR filter that uses
                      the least mean squares (LMS) algorithm.

GetALFTaps            Gets the tap coefficients for a low-level, adaptive FIR filter that
                      uses the least mean squares (LMS) algorithm.
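
The roles of the step and leak values in an LMS adaptive FIR filter can be sketched with the textbook leaky-LMS update. This is an illustrative single-rate example under stated assumptions: the class and method names are hypothetical, the update rule is the standard one rather than one confirmed by the source, and calling `filter_sample` without a desired value is one reading of filtering "without adapting".

```python
import random


class LMSFilter:
    """Single-rate adaptive FIR filter using a leaky LMS tap update."""

    def __init__(self, num_taps, step, leak=0.0):
        self.taps = [0.0] * num_taps
        self.delays = [0.0] * num_taps
        self.step = step  # adaptation step size
        self.leak = leak  # leakage factor; 0.0 gives plain LMS

    def filter_sample(self, x, desired=None):
        """Filter one sample; adapt the taps only if a desired value is given."""
        self.delays = [x] + self.delays[:-1]
        y = sum(w * d for w, d in zip(self.taps, self.delays))
        if desired is not None:
            error = desired - y
            shrink = 1.0 - self.step * self.leak
            self.taps = [shrink * w + self.step * error * d
                         for w, d in zip(self.taps, self.delays)]
        return y


# Identify an assumed two-tap system from noise-free input/output pairs.
rng = random.Random(0)
unknown = [0.5, -0.25]
lms = LMSFilter(num_taps=2, step=0.05)
previous = 0.0
for _ in range(2000):
    x = rng.uniform(-1.0, 1.0)
    d = unknown[0] * x + unknown[1] * previous
    lms.filter_sample(x, desired=d)
    previous = x
```

With no leakage and no measurement noise, the adapted taps settle very close to the coefficients of the unknown system. A non-zero leak pulls the taps gently towards zero, trading a small bias for robustness.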



Infinite Impulse Response Filter Functions

InitialiseIIR         Initialises a low-level, infinite impulse response filter of a
                      specified order.

InitialiseBiquadIIR   Initialises a low-level, infinite impulse response (IIR) filter to
                      reference a cascade of biquads (second-order IIR sections).

InitialiseIIRDelay    Initialises the delay line for a low-level, infinite impulse response
                      filter.

IIR                   Filters a single sample through a low-level, infinite impulse
                      response filter.

BlockIIR              Filters a block of samples through a low-level, infinite impulse
                      response filter.

Wavelet Functions

DecomposeWavelet      Decomposes signals into wavelet series.

ReconstructWavelet    Reconstructs signals from wavelet decomposition.
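
The cascade of biquads mentioned under InitialiseBiquadIIR can be sketched as a chain of second-order sections, each keeping its own two-element delay line. The class name, the transposed direct form II structure and the coefficient convention (denominator normalised so that a0 = 1) are assumptions for illustration only.

```python
class Biquad:
    """One second-order IIR section in transposed direct form II."""

    def __init__(self, b0, b1, b2, a1, a2):
        self.b = (b0, b1, b2)  # feed-forward coefficients
        self.a = (a1, a2)      # feedback coefficients, with a0 assumed to be 1
        self.z1 = 0.0          # delay line state
        self.z2 = 0.0

    def process(self, x):
        """Filter a single sample and update the section's delay line."""
        y = self.b[0] * x + self.z1
        self.z1 = self.b[1] * x - self.a[0] * y + self.z2
        self.z2 = self.b[2] * x - self.a[1] * y
        return y


def filter_cascade(sections, samples):
    """Pass each sample through every section in turn."""
    out = []
    for x in samples:
        for section in sections:
            x = section.process(x)
        out.append(x)
    return out


# Two identical one-pole low-pass sections: y[n] = 0.5 x[n] + 0.5 y[n-1].
sections = [Biquad(0.5, 0.0, 0.0, -0.5, 0.0) for _ in range(2)]
impulse_response = filter_cascade(sections, [1.0, 0.0, 0.0])
```

Splitting a high-order filter into second-order sections is a common way to limit coefficient sensitivity, which matters for the fixed-point data formats the library is said to support.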
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Discrete Cosine Transform Function 

DCT Performs the Discrete Cosine Transform (DCT). 

Vector Data Conversion Functions

All the functions described in this section can operate on a number of different data 
formats (such as various integer lengths, different floating point formats and fixed point 
representations of floating point numbers). The Signal Processing Library will contain 
methods to translate single values and vectors between all pairs of formats supported. 

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Claims 

1. A digital wireless communications basestation programmed with a virtual 
machine layer appropriate to baseband signal processing. 

2. The basestation of Claim 1 in which the virtual machine layer is suitable for
enabling one or more baseband processing algorithms to be represented using high level
software.

3. The basestation of Claim 1 in which the virtual machine layer runs on hardware
comprising a PCI-bus backplane.

4. The basestation of Claim 1 in which the hardware elements within the virtual 
machine communicate using an open, architecture neutral messaging system. 

5. The basestation of Claim 4 in which I2O compliant messaging is used.

6. The basestation of Claim 1 which can change from operating one set of baseband 
processing algorithms to another set solely through a change in software. 

7. The basestation of Claim 6 which can change from operating one set of baseband
processing algorithms to another set solely by changes to the underlying engines, 
implemented in soft datapaths, or hard datapaths, or a combination of the two. 

8. The basestation of Claim 1 which connects to RF elements through an interface
which is an open interface.

9. The basestation of Claim 6 in which the open interface defines one or more of
the following components:

(i) power feed;

(ii) data;

(iii) controls; 

(iv) timing/ synchronisation; 

(v) status. 



10. The basestation of Claim 1 which sends an IP-based digital IF feed to a radio
mast.

11. The basestation of Claim 10 in which the IP feed is fed up to multiple RF
units.

12. The basestation of Claim 1 in which an IP feed derived from a signal received at
the mast is passed down to multiple processor boards.



13. The basestation of Claim 1 comprising a scheduler programmed to allow scalable
processing using multiple parallel processing nodes.

14. The basestation of Claim 13 in which the scheduler uses I2O based self-discovery
of resources to enable it to exploit those resources in an optimal manner.

15. The basestation of Claim 13 in which the scheduler reads an 'a priori' partitioning
file to 'help' shape its decisions about which datapaths ought to execute on which
processing units.

16. The basestation of Claim 1 operable to simultaneously run multiple standards.

17. The basestation of Claim 1 in which the virtual machine layer supports
underlying high MIPs algorithms common to a number of different baseband processing
algorithms, and makes these accessible to high level, architecture neutral, potentially high
complexity but low-MIPs control flows through a scheduler interface, which allows the
control flow to specify the algorithm to be executed, together with a set of resource
constraint envelopes, relating to one or more of: time of execution, memory,
interconnect bandwidth, inside of which the caller desires the execution to take place.

18. The basestation of Claim 1 in which the virtual machine layer is software- 
designed to be portable to one or more DSP architectures, one or more FPGA 
architectures, and / or one or more ASIC architectures. 

19. The basestation of Claim 1 in which the virtual machine layer is software
programmed with various core processes and/or core structures and/or core functions
and/or flow control and/or state management.

20. The basestation of Claim 19 in which the core processes include algorithms to
perform one or more of the following: source coding, channel coding, modulation, or
their inverses, namely source decoding, channel decoding and demodulation.

21. The basestation of Claim 19 or 20 in which the core structures comprise a
symbol processing section (concerned with processing full symbols, regardless of
whether all the information [...] is to be used) and a data directed
processing section, in which only [...] hold relevant information are
processed.

22. The basestation of Claim 21 in which symbol rate processing comprises [...]
processing within CDMA systems.

23. The basestation of Claim 21 in which the core structure is comprised of
processing modules operable to allocate, share and dispose of intermediate, aligned
memory buffers, and pass events between themselves.



24. The basestation of Claim 19 in which the core functions include one or more of
the following: resource allocation and scheduling, including memory allocation, real time
resource allocation and concurrency management.

25. The basestation of any preceding Claim 19 operable to access PC debug tools.

26. The basestation of any preceding Claim 19 which is operable with a component
in which only that information necessary to enable software to operate with and/or
otherwise model the performance of the component is supplied by the owner of the
intellectual property in the component.



27. The basestation of any preceding Claim 19 which is operable with a standardised
description of the characteristics (including interface and non-interface behaviour) of
communications components to enable a simulator, emulator or modelling tool to
accurately estimate the resource requirement of a system using those components even
when such components are distributed in a non-symmetric access architecture, and even
where the pattern of use of the components can only be statistically, not deterministically
modelled, due to factors such as inherent 'burstiness' of the underlying data stream, or
the use of multiple streams each with its own QoS and [illegible] timings.

28. The basestation of Claim 19 operable to model time, CPU, memory, scheduling
and concurrency restraints, enabling mapping onto a real-time OS, non real-time OS,
virtual machine or hardware.

29. A baseband stack forming the baseband stack of a basestation as defined in any
preceding Claim 1 - 28.

30. A baseband stack as claimed in Claim 29 in which real or simulated components
are linked together in a pipeline using a number of standard connection types and
synchronisation methods which enable the management of the pipeline to be determined
by the data itself.

31. A design tool for simulating the baseband stack of Claim 29 or 30, in which the
design tool can link together software and hardware components using a number of
standard connection types and synchronisation methods which enable the management
of the pipeline to be determined by the data processed by the data flows.

32. The design tool as claimed in Claim 31 in which at least some of the high level
flows are specified in a procedural language such as C, C++.

33. The design tool as claimed in Claim 31 in which at least some of the high level
flows are specified in a state-machine language such as SDL.

34. A method of designing part or all of a digital wireless basestation device in which
the step of using software programmed with a virtual machine layer appropriate to
baseband signal processing.
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35. Computer software suitable for a digital wireless basestation, the software 
operating as a hardware abstraction layer and enabling one or more baseband processing 
algorithms to be represented using high level software. 



36. Computer software as claimed in Claim 35 in which the basestation is a
basestation as claimed in Claims 1 - 28.

37. Computer hardware programmed with the computer software of Claims 35 - 36.

38. RF elements suitable for connection to a digital radio basestation, in which the
basestation is as claimed in any of Claims 1 - 28.
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